Length of Therapy in Relation to Counselor Estimates 
of Personal Integration and Other ‘Case Variables’ 


Stanley W. Standal and Ferdinand van der Veen 
University of Chicago 


The relationship between case length and 
the quality or extent of psychotherapeutic 
change in client-centered therapy is virtually 
unknown. In a study of 23 client-centered 
therapy cases, Seeman (18) found a trend in 
favor of higher success ratings by the thera- 
pist for longer cases. The shorter cases 
spanned the entire range of success ratings 
from complete failure, 1, to marked success, 
9, whereas the success ratings for longer cases 
clustered about two high points on the scale 
(points 7 and 8). The variability of ratings 
was significantly lower for the long case 
group. Seeman concluded that further confirmation
of these findings could mean that if
a client is in therapy for at least twenty
interviews, there is a strong assurance of gain
from therapy, as judged by the counselor. 


In an analysis of 78 client-centered cases,
Cartwright (4) found support for the trend
noted by Seeman. In addition, he found that
rated success as a function of number of
interviews displayed two curvilinear components,
one for short cases and one for longer
cases. For 44 cases having less than 19 interviews,
the correlation ratio (eta) of success
rating on number of interviews was .66, which
was significant at better than the .01 level.
For the 42 cases with 14 or more interviews
the correlation ratio was .67, which also was
significant at better than the .01 level of
confidence. 

In view of the clinical as well as theoreti- 
cal importance which generally attaches to 
case length, simpler and/or more definitive 


1 This work was supported in part by a research 
grant (PHS M 903) from the National Institute of 
Mental Health, of the National Institutes of Health, 
United States Public Health Service. 


relationships between this variable and meas- 
ures of therapeutic process and outcome might 
be expected. In the present investigation, esti- 
mated changes in personality integration, life 
adjustment, and other case variables will be 
studied as additional factors possibly related 
to case length. 

The several case variables under considera- 
tion are derived from Seeman’s case rating 
scale (18) which comprises ten items designed 
to assess various aspects of the process and 
outcome of client-centered therapy.² All items 
are rated on a scale from 1 to 9. The first 
eight items require a beginning of therapy 
and end of therapy rating and are as follows: 


Item 1. The degree to which therapy was an intel- 
lectual-cognitive process for the client. Little or none 
(1) to maximally or exclusively (9). 

Item 2. The degree to which therapy was an emo- 
tional-experiential process for the client. Little or 
none (1) to maximally or exclusively (9). 

Item 3. The degree to which the client perceived 
therapy as a process of personal exploration or as 
specific analysis of life-situations. Situational (1) to 
personal exploration (9). 

Item 4. The degree to which the client used the re- 
lationship itself as a focus for therapy. Negligible 
extent (1) to maximally (9). 

Item 5. Estimate of the client’s attitude toward 
you during the course of therapy. Strong dislike (1) 
to strong liking or respect (9). 

Item 6. Estimate of your feelings toward the client. 
Strong dislike (1) to strong liking or respect (9). 

Item 7. The degree of personal integration of the 
client. Highly disorganized or defensively organized 
(1) to optimally integrated (9). 

Item 8. The life adjustment of the client. Low (1) 
to high (9). 


The last two items require only an end of 
therapy rating: 


2 The scale was developed jointly by Drs. Julius 
Seeman and Nathaniel J. Raskin. 


Item 9. The degree of satisfaction of the client 
with the outcome of therapy. Strongly dissatisfied 
(1) to extremely satisfied (9). 

Item 10. Your rating of the outcome of therapy. 
Complete failure (1) to marked success (9). 


A clinical appraisal of the various items
suggests that the variable most likely to be 
related to case length is personal integration. 
As described in Item 7 it implies personality 
reorganization, which seems to be regarded 
by psychotherapists of most persuasions as a 
process which is both long and gradual. 
Client-centered therapists have tended to be 
less concerned with case length, but the writ- 
ings of theorists in this approach undoubtedly 
convey the impression that personality reor- 
ganization is an extensive and gradual proc- 
ess. For example, Rogers wrote: 


The best definition of what constitutes integration 
appears to be this statement that all the sensory and 
visceral experiences are admissible to awareness 
through accurate symbolization, and organizable 
into one system which is internally consistent and 
which is, or is related to, the structure of self (15, 
pp. 513-514). 


And in describing the process of becoming 
integrated: 

Exploration of experience is made possible by the 
counselor, and since the self is accepted at every 
step of its exploration and in any change it may 
exhibit, it seems possible gradually to explore areas 
at a “safe” rate, and hitherto denied experiences are 
slowly and tentatively accepted . . . (15, p. 518). 

Gradually he [the client] comes to experience the 
fact that he is making value judgments . . . (15, p. 
522). 


Standal (19) saw the fundamental unit of 
personality reorganization as (a) the perception 
by the client of the therapist’s attitude 
of unconditional positive regard in relation to 
some experience tentatively symbolized and 
expressed as “self”; (b) the transformation 
of this “external” positive regard to self-regard; 
and (c) the generalization of this 
newly developed self-regard to similar or related 
denied or distorted experiences, which, 
in turn, may then be tentatively symbolized 
and expressed as “self.” As to the number of 
such fundamental transactions, psychotherapeutic 
change is “. . . the fusion of thousands 
of instances of the process we have just 
described . . .” (19, p. 100). And a summary 
statement which suggests the association 
of personality reorganization with extent of 
therapeutic contact is the following: “The 
maladjusted individual has an extensive system 
of conditions of worth which are relatively 
impervious to anything but a sustained 
relationship characterized by positive regard 
transactions as extensive as those upon which 
the existing self-regard structure was built” 
(19, p. 111). 
One theorist has taken a 
position which implies that personality change 
may not take so long as is ordinarily believed. 
Analyzing the processes of psychotherapy 
and personality reorganization in 
terms of learning and perception theory, 
Butler (1) argued that when the therapist 
consistently communicates understanding and 
acceptance from the very beginning of therapy, 
he is less likely to arouse many time-consuming 
and perhaps unnecessary resistance 
reactions in the client. The implication is 
that client-centered therapy may require less 
time for a given degree of personality reorganization 
than does an approach which involves 
a period of relatively passive behavior 
on the part of the therapist followed by a period 
of systematic interpretation. Nevertheless, 
it is clear throughout most of the paper 
that Butler still sees personality reorganization 
as a gradual and time-consuming process 
even under optimal circumstances. 
Changes in the variables represented by 
the other items of the rating scale may often 
be closely related to time, but unlike changes 
in personal integration they may easily be 
envisaged as occurring over very short periods. 
For example, although one usually expects 
therapy to become a more emotional-experiential 
process (Item 2), with certain 
clients or certain client-counselor combinations 
therapy may be heavily emotional-experiential 
from the beginning. Similarly for 
the liking or respect the client and counselor 
have for each other (Items 5 and 6); although 
mutual respect between client and 
counselor is likely to grow with increasing 
therapeutic contact, they will often like and 
respect each other immediately, or 
increase mutual liking and respect on the 
basis of a few interviews. Even movement toward 
life adjustment (Item 8), which is usually 
a very lengthy process, may proceed rapidly 
with some stroke of good fortune, or through 
a single fresh perception of some key problem. 
Similar arguments can be advanced for 
each of the other items, including the judgment 
of over-all success (Item 10), the most 
frequently studied item of the scale. “Success” 
has no specific referents but presumably 
is based upon a combination of case factors, 
those included in the scale as well as others 
which may be highly idiosyncratic to the 
counselor. A short case can be judged as 
highly successful almost exclusively on factors 
relatively independent of case length, 
e.g., rapport, insights achieved, displays of 
emotion, client satisfaction, and so forth. 

With the above considerations in mind, the 
following two hypotheses are advanced: 

1. Movement toward personal integration 
is positively related to length of therapy. 

2. Movement toward personal integration is 
more highly related to length of therapy than 
is change or outcome on any other of the 
nine case variables. 

Although the other nine case variables are 
not likely to be as closely related to length of 
therapy as movement toward personal inte- 
gration, clinical experience suggests that they 
are all somewhat dependent upon the amount 
of contact between client and therapist. Ac- 
cordingly, it is hypothesized that: 

3. All the other variables defined by the 
items of the case rating scale are related to 


length of therapy. 


——— 


Procedure 


Subjects. The data were taken from the 
cases of 73 research clients who were seen for 
at least two interviews at the Counseling Cen- 
ter, University of Chicago, during the period 
1949-1954. To include cases of one inter- 
view, as does Cartwright (4), has the ad- 
vantage of not excluding data which may be 
pertinent, but it can be argued that single in- 
terview cases may often be nothing more than 
“preliminary interviews” in which the client 
has simply sized up the situation and decided 
against entering therapy. When a rather high 
cutoff point is selected, as in the studies by 
Seeman (18) and others (17) where only 
cases of six or more interviews were used, the 




3 The sample of subjects is almost, but not quite, 
identical with that studied by Cartwright (4). 


chances of including pseudo-cases are greatly 
decreased, but the chances of excluding real 
cases are increased considerably. In the present 
study, cases with but one interview 
were eliminated on the assumption that, even 
though the client had had a preliminary interview 
with a different counselor, the first 
interview with the therapist proper constitutes 
a preliminary interview from the client’s 
point of view. Where the client continued 
with therapy, however, the first interview 
was assumed to have been therapeutic and 
hence justifiably included in case length. It 
also might be mentioned that the inclusion of 
single interview cases tends to raise slightly 
the correlations to be reported, so the exclu- 
sion of such cases leads to more conservative 
estimates of the relationships. 

The subjects were 42 males and 31 females, 
25 of whom were community clients and 48 
of whom were students. The mean age was 
26.7 with a standard deviation of 4.5. Al- 
though many of the clients were referred to 
the Center, all came of their own volition and 
participated in the various research projects 
on the same basis. They were seen by 16 dif- 
ferent therapists, two of whom were females. 
The therapists ranged in experience from 
about one year to over 15 years of thera- 
peutic work. The largest proportion had from 
three to six years of experience. 

Case length. For this study case length, the 
independent variable, is the amount of time 
spent with the therapist and is measured in 
terms of number of interviews, each interview 
being slightly less than one hour long. The 
decision to end therapy was almost invariably 
left to the client. 

The distribution of case length is presented 
in Table 1, and is highly positively skewed. 
In order to fulfil the assumption of normality 
for the Pearson product-moment correlation 
coefficient and to obtain simpler relationships, 
the logarithm to the base 10 of the 
number of interviews was calculated for each 
case. The transformed distribution, as shown 
in Table 1, approximates normality. Also it 
was found that the transformation produced 
a more linear relationship between case length 
and movement toward personal integration. 
The transformed values were used in all statistical 
calculations involving case length. In 
evaluating our results it should be remembered 
that this transformation makes differences 
in case length for shorter cases much 
more important than corresponding differences 
in length for longer cases. 


Table 1 


Distributions of Cases by the Number of Interviews 
and by the Logarithm of the Number of 
Interviews of Each Case 


Number of          Number      Log10 of the number   Number 
interviews         of cases    of interviews         of cases 
2-8                17           .30- .49              1 
9-15               16           .50- .69              3 
16-22               6           .70- .89              9 
23-29               1           .90-1.09             16 
30-36               5          1.10-1.29             10 
37-43               4          1.30-1.49              8 
44-50               4          1.50-1.69             12 
51-57               4          1.70-1.89             10 
58-64               3          1.90-2.09              2 
65-71               1          2.10-2.29              2 
72 plus (to 178)    6 

N = 73                         N = 73 
Mean = 30.69                   Mean = 1.281 
SD = 32.53                     SD = 0.428 
Median = 18                    Median = 1.255 
                               Antilog of Mean = 19.1 
                               Antilog of Median = 18.0 
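
The logarithmic transformation of case length described above can be illustrated with a brief sketch. It is offered only as an illustration, not as the authors' computation: the function name `log_case_length` and the example interview counts (2, 4, 50, 100) are assumptions introduced here, while the values 1.281 and 1.255 are the log mean and log median reported in Table 1.

```python
import math

def log_case_length(n_interviews: int) -> float:
    """Return log10 of the number of interviews (hypothetical helper)."""
    if n_interviews < 1:
        raise ValueError("number of interviews must be at least 1")
    return math.log10(n_interviews)

# Equal ratios of interviews are equal distances on the log scale, so a
# difference of 2 interviews among short cases weighs as much as a
# difference of 50 interviews among long cases (illustrative counts).
short_gap = log_case_length(4) - log_case_length(2)
long_gap = log_case_length(100) - log_case_length(50)

# Back-transforming the summary values reported in Table 1.
antilog_mean = 10 ** 1.281
antilog_median = 10 ** 1.255
```

Both gaps equal log10 of 2, which is the sense in which differences among shorter cases count for more than the same absolute differences among longer cases. The back-transformed values agree, to one decimal place, with the antilogs given in Table 1.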

As has been indicated, the dependent vari- 
ables of this study were inferred from the 
case-rating scale used by Seeman (18). All 
ratings were made by the counselor at the 
termination of therapy. The first eight items 
required a rating of the client for the begin- 
ning and for the end of therapy. The differ- 
ence between these two values thus represents 
a movement score and was used in all calcu- 
lations for the first eight items. The last two 
items, client satisfaction and success, required 
a rating only for the end of therapy. Since 
counselor judgments were the only estimates 
of change or of outcome on all ten case vari- 
ables, their reliability and validity deserve 
considerable attention. 
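
The movement score described above is a simple difference between two ratings on the 1-to-9 scale. A minimal sketch follows; the function name `movement_score` and the example ratings are hypothetical and are not taken from the study's data.

```python
def movement_score(beginning: int, end: int) -> int:
    """End-of-therapy rating minus beginning-of-therapy rating (Items 1-8)."""
    for rating in (beginning, end):
        if not 1 <= rating <= 9:
            raise ValueError("ratings must lie on the 1-to-9 scale")
    return end - beginning

# Hypothetical client rated 3 at the beginning and 7 at the end on Item 7.
gain = movement_score(3, 7)
# Hypothetical client rated 6 at the beginning and 4 at the end.
loss = movement_score(6, 4)
```

Under this definition a movement score can range from -8 to 8, whereas Items 9 and 10, which are rated for outcome alone, stay on the original 1-to-9 scale.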

Reliability. The reliability of counselor 
judgments is difficult to estimate accurately 
since even the simplest prerequisites for such 
estimates are either difficult or impossible to 
meet. If ratings are to be made on the basis 
of the total therapeutic situation, then inter- 


judge agreement cannot be found since only 
one therapist is intimately familiar with each 
case. Estimates of intrajudge consistency, al- 
though not impossible, involve considerable 
difficulty because of the nature of the scale 
as well as the subject matter. The shortness 
of the scale and the type of questions make 
a split-half or alternate-forms approach relatively 
unfeasible. 

A simple test-retest is, strictly speaking, 
impossible because the counselor cannot be 
exposed to the total therapeutic situation 
twice. The customary procedure is to readminister 
the scale as an approximation of 
test-retest conditions. This procedure was followed 
in the present study but its limitations 
are apparent. The rating of a case involves 
thought and a decision as to the number to 
assign to a given item. If the second rating is 
made after a short time, it may be based 
largely on memory of the first rating, resulting 
in a spuriously high estimate of reliability. 
On the other hand, if the first and second 
ratings span a time interval long enough to 
allow the first rating to be forgotten, the 
counselor may have also forgotten aspects 
of the case. He may also have changed 
considerably his frame of reference about 
therapy. These two factors would result in an 
underestimation of reliability. Of the two procedures 
the latter is usually followed since it 
leads to an estimate of minimal reliability. 
Seeman (18) had seven cases rerated after 
a mean interval of five months and found a 
mean correlation between judgments for all 
items (using Fisher’s normalizing transformation) 
of .81. Cartwright (4) had several counselors 
rerate 15 cases on Item 10 and, 
after a mean interval of 14.2 months, 
found a correlation of .86. 
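
Averaging correlations by way of Fisher's normalizing transformation, as mentioned above, can be sketched as follows. This is a generic illustration of the technique and not a reconstruction of Seeman's calculation: the function name `mean_correlation` and the two example coefficients are assumptions, and an unweighted mean is assumed.

```python
import math

def mean_correlation(rs: list[float]) -> float:
    """Average correlations via Fisher's z: transform, average, back-transform."""
    if not rs:
        raise ValueError("at least one correlation is required")
    zs = [math.atanh(r) for r in rs]   # Fisher's z = arctanh(r)
    mean_z = sum(zs) / len(zs)         # unweighted mean on the z scale
    return math.tanh(mean_z)           # return to the r scale

# Hypothetical item reliabilities, not values from any cited study.
example = mean_correlation([0.90, 0.70])
```

Because the z scale stretches values near 1, the back-transformed mean of .90 and .70 is slightly above their arithmetic mean of .80.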
In the present study five counselors rerated 
all ten items for 11 cases after a mean interval 
of 34 months. Table 2 presents the 
correlations between these ratings. Reliability 
for Items 7, 8, 9, and 10 ranged from fair to 
excellent. For the other items the reliability 
was not good or doubtful. In evaluating these 
results it should be noted that the N was 
small and that the time interval between ratings 
was very large. For the beginning ratings 
the latter factor was even more critical 
than for the terminal ratings since the interval 
between the start of therapy and the 
second rating was even longer. On several 
cases, and notably for beginning ratings on 
Items 5 and 6, some counselors could not remember 
enough to attempt a second rating. 


Table 2 

Reliability Coefficients Between First and Second Counselor Ratings for Beginning, Terminal, 
and Movement Scores on Ten Case Variables 
(N = 11, except for B and M coefficients for Item 4 where N = 10) 

                                  Case Rating Scale Items 
Counselor 
rating      1         2         3       4       5      6      7      8         9        10 
Beginning   [illeg.]  .33       .05     -.63*   --     --     .50    .67* 
Terminal    .36       .70**     .28     .63*    .41    .61*   .68*   .67*      .94***   .67* 
Movement    .40       [illeg.]  -.27    .54     --     --     .49    [illeg.] 

* p < .05. 
*** p < .001. 

Validity. Although demonstrations of the 
validity of counselor judgments based upon 
first order criteria are almost unavailable, re- 
lationships with many lesser order criteria 
have been established over the past few years. 
Citing earlier studies, Seeman (18) pointed 
to a significant relationship between counselor 
ratings of success and a rising ratio of posi- 
tive attitudes as therapy proceeds (14), a 
significant composite correlation (rho = .70) 
between counselor ratings and five experimentally 
independent process measures (13), 
a significant correlation between case ratings 
and MMPI changes over therapy (12), a cor- 
respondence between Rorschach change and 
case rating (10), a correlation between case 
ratings and Rorschach changes significant at 
the 10 per cent level (11), and three findings 
of no relationship between Rorschach changes 
and counselor ratings (3, 9, 12). 

In evaluating these results it should be pointed 
out that the process measure, rise in positive attitudes, 
studied by Raimy (14), and the process measures 
used in the five studies analyzed by Raskin 
(13) (i.e., attitudes toward self, acceptance of and 
respect for self, understanding and insight, maturity 
of behavior reported by the client, and defensiveness) 
were derived from transcriptions of the therapeutic 
interviews and also are just the kinds of 
criteria a client-centered therapist might use in 
evaluating the success of a case. These two factors 
detract from the validity [remainder of sentence illegible]. 
On the positive side, however, a body of clinical 
and theoretical knowledge supports the notion that 
changes in these process measures are associated with 
a healthier adjustment to life. The results thereby 
lend some indirect support to our confidence in the 
functional validity of counselor judgments. Similarly, 
the findings in two studies of positive change 
on MMPI factors and Rorschach perceptions indicate 
that counselor judgments differentiate some kind 
of behavior which may in turn be related to a 
healthier personality reorganization and life adjustment. 
The implications of the Rorschach finding, 
however, are seriously mitigated by the four other 
Rorschach studies reporting nonsignificant results. 
In more recent studies, Butler and Haigh (2) discovered 
a significant increase in self-esteem, as inferred 
from the degree of congruence between self 
and self-ideal concepts, for a “definitely improved” 
group of clients as compared with a “not definitely 
improved” group and a control group. Counselor 
judgments of success was one of the two criteria of 
improvement. Dymond (5) found counselor judgments 
of success to be significantly related to adjustment 
scores based on clinicians’ judgments of 
Q-sort statements. Gordon and Cartwright (7) also 
reported a significant correlation (rho = .60) between 
rated success and these Q-adjustment statements. 
Vargas (21) found significant relationships 
(rhos ranging from .64 to .99) between judged success 
and six indices of increasing self-awareness. 


As support for the validity of counselor 
judgments the above findings are subject to 
the same kinds of limitations and advantages 
discussed previously with respect to process 
measures of client-centered therapy. The re- 
lation of the criterion variables to adjustment 
in everyday life is unknown, and the data 
upon which they are based, i.e., client statements 
about self, are the same kind as those 
available to the counselors and would be 
likely to influence judgments of success. 


The criterion variables of two other studies appear 
to have the advantage of being more independent of 
the counselor. Dymond (6) found counselor 
judgments of success to be significantly correlated 
with ratings of adjustment based on TAT analyses 
after a follow-up period of six months. Tougas (20) 
found a relationship between rated improvement and 
degree of ethnocentrism, there being a significant 
likelihood that a client would be rated from 5 
through 9 on global success if his ethnocentrism 
score was at or below the mean upon entering 
therapy. A third study, with a criterion variable of 
about the same order as the above two, reported 
findings which are unequivocally nonsupportive. 
Grummon and John (8) found no correlation between 
counselor judgments and mental health, as indicated 
by the ratings of TAT stories by a psychodiagnostician 
on several scales based on a psychoanalytical 
conception of mental health. 

The clearest findings bearing on the functional va- 
lidity of counselor judgments are those of Rogers 
(16) who reported a significant relationship (r = .41) 
between counselor ratings of success and independent 
ratings by two friends of the client on a 
scale designed to measure emotional maturity, an 
adaptation of the Willoughby Emotional Maturity 
Scale. The criterion variable was based on behaviors 
in everyday life which also are relatively unavail- 
able to the counselor. 

The findings cited above bear only on ratings of 
success, i.e., Item 10 on the scale. As to the other 
items, Vargas (21) found five measures of developing 
self-awareness to be significantly correlated (rs 
of .67 to .86) with ratings of personal integration 
(Item 7), and a sixth measure not significantly related. 
Vargas (21) also reported negative correlations 
between personal integration and measures from 
four of the rating scales devised by Grummon and 
John (8). Rogers (16) noted a significant positive 
correlation (r = .50) between the degree of change in 
personal integration (Item 7) and the degree of 
change in maturity of behavior as seen by friends 
over the period of therapy. Over the period from 
before therapy to the follow-up point the correlation 
was higher (r = .67). Finally, Seeman (18) reported 
significant correlations (rs ranged from .20 through 
.89) between the first nine items and the success 
rating. 

To summarize the findings on validity, 
there is some evidence against and consider- 
ably more evidence for the belief that coun- 
selor judgments of success are fairly well re- 
lated to test, therapy, and social behaviors 
thought to be indicative of success in psychotherapy. 
In two studies counselor judgments 
of personal integration showed fair to high 
relationships with therapy behavior and fair 
to good relationships with maturity of be- 
havior in everyday life. One study showed 
negative correlations between ratings of per- 
sonal integration and ratings of test behav- 
ior. Another study reported low to very high 
relationships between rated success and coun- 
selor ratings on the nine other variables. 


Results and Discussion 


Case length and movement toward personal 
integration. The product-moment correlation 
between case length and movement toward 
personal integration (with number of inter- 
views transformed logarithmically) is .58, 
which is significant beyond the .001 level of 
confidence (Table 3). Figure 1 shows mean 
movement scores plotted against log number 
of interviews. The values for the points of the 
graph may be found in Table 4. 
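
The significance of a product-moment correlation of .58 can be checked roughly with Fisher's z transformation and a normal approximation. The sketch below is an assumption-laden illustration: the source does not state which test the authors used, the function name `fisher_z_test` is hypothetical, and the sample size of 72 is taken from the caption of Figure 1.

```python
import math

def fisher_z_test(r: float, n: int) -> tuple[float, float]:
    """Approximate two-tailed test of H0: rho = 0 using Fisher's z."""
    if n <= 3 or not -1.0 < r < 1.0:
        raise ValueError("need n > 3 and -1 < r < 1")
    z = math.atanh(r) * math.sqrt(n - 3)      # standard normal under H0
    p = math.erfc(abs(z) / math.sqrt(2.0))    # two-tailed normal p-value
    return z, p

# r = .58 as reported in the text; n = 72 assumed from the Figure 1 caption.
z_stat, p_value = fisher_z_test(0.58, 72)
```

With these inputs the standardized statistic is about 5.5, so the approximate p-value falls far below .001, in line with the level of confidence reported above.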

The first hypothesis of this study, that 
movement toward personal integration is positively 
related to case length, is supported. It 
may be said with considerable confidence that 
change in personal integration has a fairly 
good linear relationship with the logarithm of 
case length. Conversely, case length may be 
regarded as a more meaningful variable than 
it has hitherto appeared. 

Movement toward personal integration and 
all other case variables. Referring to Table 3, 
it will be seen that, although most of the 
other variables have significant correlations 
with log case length, movement toward personal 
integration has the highest correlation (r = .58). 
The next highest is over-all success (r = .37). 
The difference between .58 and the value of 
r for each of the other variables (using Fisher’s 
transformations) is significant beyond the 
.01 level (t > 3.30). 


Table 3 


Product-Moment Correlation Coefficients Between Log 
Number of Interviews and Counselor Judgments 
of Movement or Outcome* on 
Ten Case Variables 


                                    Correlation 
                                    with log 
Item                                length       N          p 
 1. Intellectual-cognitive          -.28         [illeg.]   [illeg.] 
 2. Emotional-experiential           .32         [illeg.]   [illeg.] 
 3. Personal-situational             .16         [illeg.]   [illeg.] 
 4. Focus on relationship            .33         [illeg.]   [illeg.] 
 5. Client’s liking or respect       .29         [illeg.]   [illeg.] 
 6. Therapist’s liking or respect    .18         [illeg.]   [illeg.] 
 7. Personal integration             .58         [illeg.]   .001 
 8. Life adjustment                  .32         [illeg.]   [illeg.] 
 9. Client satisfaction              .23         [illeg.]   [illeg.] 
10. Global success                   .37         [illeg.]   [illeg.] 

* Items 9 and 10 were judged as to outcome alone; the other items 
were judged for the beginning and for the end of therapy. 
[A second footnote, concerning the point at which a correlation 
achieves significance at the .05 level, is illegible.] 


[Figure 1: vertical axis, mean movement on personal integration; 
horizontal axis, log number of interviews.] 

Fig. 1. Movement on personal integration as a function 
of log number of interviews for 72 cases. 

These results clearly support the second 
hypothesis under consideration, that change 
in personal integration is more highly related 
to case length than change or outcome on 
nine other case variables generally deemed 
important in client-centered therapy. Perhaps 
personal integration should be given a larger 
role in future studies of client-centered ther- 
apy. By and large the over-all success rating 
has been used as the major case variable 
where counselor judgments are concerned. If 
movement toward personal integration is more 
highly related to actual amount of therapy 
than is the estimated degree of success, it may 
be a more fruitful variable to study. 

Case length and movement or outcome on 
other case variables. Seven case variables 
have low but significant correlations with log 
case length: success rating of the case (r = 
.37); change in the degree to which the client 
used the relationship itself as a focus for therapy 
(r = .33); change in the life adjustment 
of the client (r = .32); change in the degree 
to which therapy was an emotional-experiential 
process for the client (r = .32); change 
in the client’s attitude of liking or respect for 
the therapist (r = .29); change in the degree 
to which therapy was an intellectual-cognitive 
process for the client (r = -.28); and 
the client’s satisfaction with the outcome of 
therapy (r = .23). Two case variables did 
not have significant correlations with case 
length: change in the therapist’s attitude of 
liking or respect for the client (r = .18); and 
change in the degree to which the client perceived 
therapy as a process of personal exploration 
as opposed to an analysis of life 
situations (r = .16). The manner in which 
the scores for the various items are distributed 
is shown in Table 4, which presents the mean 
movement or outcome scores for successive in- 
tervals of log case length for each of the 
items. 

The results support the third hypothesis, 
that the nine other case variables are related 
to case length, for seven of the variables, but 
not for two. As predicted, the relationships 
are significant but not strong. It seems clear 
that factors other than amount of therapeutic 
contact are largely responsible for change or 
outcome along these nine various dimensions. 


Table 4 


Mean Movement or Outcome* Scores on Ten Case Variables for Successive Intervals of Log Case Length 


                          Log Number of Interviews 
                         (Raw Number of Interviews) 

Case Rating  .30-.69   .70-.89   .90-1.09  1.10-1.29  1.30-1.49  1.50-1.69  1.70-1.89  1.90-2.09  2.10-2.29 
Scale Item   (2-4)     (5-7)     (8-12)    (13-19)    (20-31)    (32-49)    (50-78)    (79-124)   [illeg.] 
 1           [illeg.]  [illeg.]  [illeg.]  [illeg.]   [illeg.]   [illeg.]   [illeg.]   [illeg.]   [illeg.] 
 2           2.25      1.00      0.63      1.40       2.50       2.25       3.30       [illeg.]   [illeg.] 
 3           2.50      1.33      1.75      1.40       2.50       2.50       2.40       0.50       5.00 
 4           0.25      0.44      1.50      1.60       [illeg.]   1.83       2.20       [illeg.]   3.00 
 5           0.50      0.71      1.37      1.20       [illeg.]   1.75       2.80       [illeg.]   [illeg.] 
 6           1.75      1.00      1.56      1.20       0.87       1.83       [illeg.]   2.50       [illeg.] 
 7           0.25      1.11      1.81      [illeg.]   [illeg.]   [illeg.]   [illeg.]   [illeg.]   [illeg.] 
 8           1.75      2.00      1.93      0.89       2.37       2.36       3.20       2.50       [illeg.] 
 9           5.25      5.00      6.20      4.30       5.50       6.33       6.50       6.50       6.00 
10           [illeg.]  [illeg.]  [illeg.]  [illeg.]   [illeg.]   [illeg.]   [illeg.]   [illeg.]   5.50 

* Items 9 and 10 were rated on outcome alone. 


Table 5 


Product-Moment Correlation Coefficients Between 
Movement on Item 7 (Personal Integration) 
and Movement on Items 1 Through 8 and 
Outcome on Items 9 and 10 


                                    Correlation 
                                    with 
Item                                Item 7       N          p-value 
 1. Intellectual-cognitive          -.42         72         .001 
 2. Emotional-experiential          [illeg.]     72         [illeg.] 
 3. Personal-situational            [illeg.]     [illeg.]   [illeg.] 
 4. Focus on relationship            .26         72         .05 
 5. Client’s liking or respect       .52         72         .001 
 6. Therapist’s liking or respect   [illeg.]     [illeg.]   [illeg.] 
 8. Life adjustment                  .66         69         .001 
 9. Client satisfaction              .43         71         .001 
10. Global success                   .67         72         .001 

Movement toward personal integration and 
other case variables. Table 5 presents the cor- 
relations between movement on personal inte- 
gration and the other case variables. No hy- 
potheses have been advanced concerning these 
relationships, but it is of interest to compare 
them with their individual correlations with 
log case length. Although movement on per- 
sonal integration correlated fairly well with 
several other items, these items did not cor- 
relate nearly as well with log case length. 
Case length and all other case variables. 
As an over-all test of the relationship be- 
tween case length and all other case variables 
a two-way analysis of variance was calculated 
for the data in Table 4.⁴ Table 6 presents the 
results of this analysis. The effect of length 
is highly significant (F = 6.39, p < .001), 
which lends additional support to the rele- 
vance of case length for the study of therapy. 


Summary and Conclusions 


On the assumption that case length should 
be more clearly related to therapeutic change 
than the results of previous studies indicated, 
it was compared with movement toward per- 
sonal integration, movement toward life ad- 
justment, over-all success, and several other 
variables derived from counselor judgments 
on 73 cases of two or more interviews in 


*The signs of the means of Item 1 were reversed 
to simplify the analysis. 
which the client-centered approach was used. 
On clinical as well as theoretical grounds it 
was hypothesized that movement toward per- 
sonal integration would be more highly re- 
lated to case length than would any other 
case variable, but that all case variables would 
be related to case length. To fulfil the assump- 
tion of normality and to obtain more linear 
relationships the number of interviews for 
each case was transformed logarithmically. 
The reliability and validity of counselor
judgments were discussed. Although the
assumptions for reliability estimates were not
met, a reasonable case was made for taking
the agreement between two widely spaced
administrations of the counselor rating scale
as an estimate of minimal reliability.
Estimates from this study, as well as previous
ones, indicated a fair to good degree of
reliability of counselor judgments on some items
and poor or no reliability on others. The
validity of two of the items, success and
personal integration, was, in general, supported.
For the validity of the other items there was
too little evidence to warrant any conclusions.
The two major hypotheses were fully
supported by the results. The correlation
between movement toward personal integration
and log case length was .58, which was
significant at the .001 level of confidence. The
values of t for the differences between the
correlation of movement toward personal
integration with log case length and the
correlations of the other items with log length were
equal to or larger than 3.30, and significant
at less than the .01 level.


Table 6


Summary of the Analysis of Variance for Nine
Intervals of Log Case Length Versus Mean
Scores on Ten Case Variables


                              Sum of              Variance
Source of variation           squares      df     estimate
Between items                207.2360       9     23.0262
Between length intervals      37.7307       8      4.7163*
Within groups                 53.1144      72       .7377
Total                        298.0811      89


* Length F = 6.39; n1 = 8, n2 = 72; p < .001.


The hypothesized relationships between case
length and the other case variables were
supported in the majority of instances. Low but
significant correlations were found between 
log case length and seven variables. Two case 
variables correlated with log case length in 
the predicted direction, but the correlations 
were not significant. The F for the effect of 
log case length on all the variables was highly 
significant. 

The major conclusions were: (a) Change 
in level of personal integration is positively 
related to case length. Such change has a 
moderate linear relationship with log case 
length. (b) Change in level of personal inte- 
gration is more highly related to case length 
than change or outcome on other important 
case variables. (c) Most case variables are 
slightly related to length of therapy. (d) 
With respect to actual amount of therapy, 
change in personal integration may be more 
important than rated success or other case 
variables. (e) Case length can be a meaning- 
ful variable in the study of therapy. 


Received May 17, 1956. 
The Assessment of Communication: The Relation
of Clinical Improvement to Measured Changes 
in Communicative Behavior 


Betty L. Kalis and Lillian F. Bennett 


University of California School of Medicine and the Langley Porter Clinic 
San Francisco, California 


The assessment of recovery from mental 
illness which requires hospitalization may 
center on a patient’s sociological recovery, on 
his capacity to regain his place in the com- 
munity and to remain outside of the hospital, 
or on his psychological equilibrium, on the 
cohesiveness of his personality integration. 
Social recovery is relatively easy to ascertain 
on an actuarial basis after the patient has 
left the hospital. Psychological recovery is 
more difficult to measure, partly because ade- 
quate independent criteria of such recovery 
have not been established. Most difficult of 
all is the assessment of recovery while the 
patient is still in the hospital. It is possible 
to reach agreement about the extent of im-
provement, however, by inquiring of those
who have contact with the patient, and it 
is on the basis of such inquiries that dis- 
charges from psychiatric hospitals are fre- 
quently made. The psychiatrist who sees the 
patient in therapy discusses changes he has 
observed in the interview. The nurses report 
changes in his behavior on the ward. The 
psychologist notes variations in his diagnos- 
tic test protocols. Perhaps the social worker 
brings data from relatives who have seen the 
patient for week-end visits. And the patient 
gives his own impressions of his readiness to
leave. After pooling all this information, a
decision is made which is based on a comparison
of the patient’s behavior with the behavior
of other patients who were discharged.
In reviewing all the criteria used in the
assessment of improvement, it would seem
that changes in the communicative behavior
of the patient heavily influence the decision.
[...] terms used in psychiatry refer to the
communicative behavior of patients, and that,
in fact, all psychopathology can be viewed
as a disturbance of communication. In the
present study, therefore, the hypothesis was
adopted that mental illness requiring
hospitalization is characterized by a breakdown
of communication, and that improvement is
accompanied by more effective communication
of the patient with significant persons in
his surroundings (5). Human communication
can be examined only in a social context,
and a method has been developed, the
Interpersonal Test, which we have described in
detail elsewhere (7), making it possible to
measure the effects that communication has
had upon the patient in a two-person
situation. By repeating the tests in the course of
several months, changes in the patient’s ways
of communicating likewise can be detected.

1 This fifth study under a common title is part of
a long-term research project on communication
carried out at the Department of Psychiatry,
University of California School of Medicine, and the
Langley Porter Clinic, San Francisco, under the direction
of Dr. Jurgen Ruesch. This investigation was
supported in part by a research grant (M-534) from
the National Institute of Mental Health of the
National Institutes of Health, Public Health Service.



Method and Subjects


To test the relationship between rated
improvement and effectiveness of communication,
twenty-five psychiatric inpatients and
the relatives accompanying them to the
hospital were given the Interpersonal Test at
the time of admission² and again at the time
of discharge. The Interpersonal Test consists
of Q sorts (9) of cards on which are typed
simple statements bearing upon actions,
motivations, intentions, and moods occurring in
two-person situations. The statements were
classified into the following categories:


Statements Referring to Action


1. Simple action statements of varied levels of
abstraction. Examples: “I argue with him,” “I reproach him.”
2. Actions which denote intent. Example: “I try
to reassure him.”
3. Actions which infer intent. Example: “He tries
to reassure me.”
4. Actions with built-in effect. Example: “He
embarrasses me.”
5. Action with inferred effect. Example: “I bore
him.”


Statements Referring to Feelings


6. Subjective feelings. Example: “I like him.”
7. Inferred feelings. Example: “He is ill at ease
with me.”


Statements Referring to Attitudes or Expectations


8. Subjective attitudes. Example: “I respect him.”
9. Inferred attitudes. Example: “He trusts me.”


Interactive Statements Bearing upon
Personality “Traits”


10. Interpersonal trait variability. Example: “I
have difficulty making decisions when with him.”


The method is based on reciprocal sorts in 
which each person sorts the cards twice— 
once as applied to self and once as applied to 
the other person. The results are then paired 
in the following fashion: Statements sorted 
by Person A (what “I do when with him”) 
are paired with the corresponding statement 
of Person B (what “he does when with me”), 
in this instance the Patient “I-him” with the 
Relative “He-me.” Thus, the agreement and 
disagreement of two persons about each other 
can be compared in a standardized form. In- 
formation derived from the sortings describes 
functioning for any one occasion, and re- 
peated sortings reflect change over time. All 
patients who were accessible to the testing 
procedure on admission were seen.³ Fifteen
women and ten men participated in the study. 
Their mean age at the time of admission was 
33 years, with a range from 18 to 57 years. 
Mean length of hospitalization was five 
months, with a range from 2 to 114 months. 
Husbands of eleven of the women patients 
were the participating relatives, and the other 
four, all of whom were unmarried, were tested 
regarding their mothers. In the group of ten 
men, seven wives and three mothers partici- 
pated. 

2 Both patients and relatives were seen within one
week of admission.

3 Actually, many more than twenty-five patients
and relatives were seen initially, but time
restrictions on the study made retesting of later
admissions impossible. Results are reported for those
patients and relatives seen both on admission and at
discharge.

Among the patients were included cases 
with different diagnoses who received vari- 
ous therapeutic measures during their course 
of hospitalization. Sixteen of the patients 
were diagnosed as having some type of 
schizophrenic reaction, eight as depressive 
reaction, and one merely as “psychotic re- 
action.” Seventeen of the patients were given 
some kind of somatic therapy in addition to 
psychotherapy while the other eight had psy- 
chotherapy only. We were not concerned with 
the differential improvement rates for diag- 
nostic groups or treatment methods. Our 
focus was on the communicative changes 
which accompanied improvement, whatever 
the initial and final states and regardless of 
the kind of treatment or the chronological 
time that intervened. Patient and relative 
pairs could thus serve as their own controls, 
since previous research has demonstrated that 
the Interpersonal Test itself is reliable over 
time (1, 2, 4). 

An assumption of the study was that the 
relative participating in the admission pro- 
cedure was a significant person in the pa- 
tient’s interpersonal sphere. In every case, 
the patient had been living with the relative 
concerned up to the time of hospitalization. 
It seemed meaningful to ask these people 
“How do you act with each other? What is 
the nature of your relationship?” We might 
be interested, further, in the question of what 
it is about the relationship which results in 
the one person becoming the patient while 
the other remains out of the hospital. 

Clinical indices of improvement were based 
upon rating scales designed for this study 
and filled out by a research psychiatrist from 
information in the clinical charts. The ratings 
were made for reported premorbid status, 
condition at time of admission and again at 
time of discharge. A fourth assessment of 
status was also made approximately one year 
after discharge. The rating scales covered 
such areas of functioning as occupation, home 
and family, friends and community, physical 
and mental status, ward behavior, inferred 
attitude toward self, etc. Patients were rated 
as unimpaired, mildly impaired, or severely 
impaired for each of the various aspects of 
these areas. The psychiatrist assessing im- 
provement attempted to support all ratings 
with specific behavioral descriptions bearing 
upon the area involved. This procedure re- 
quired methodical study of the material in 
the charts and, while it was time-consuming 
and complicated, the resulting assessment of 
functioning probably has greater validity 
than simple over-all impressions. 


Results 


On the basis of these ratings at the time of 
discharge, the patients could be classified as 
improved, moderately improved, and unim- 
proved. Improved patients were those who 
were unimpaired or who had mild impair- 
ment in no more than two areas at the time 
of discharge. Eight of the patients fell in this 
category. Moderately improved patients were 
those who were judged to be still mildly im- 
paired in three or more areas but not severely 
impaired in any at the time of discharge. 
Nine patients were so classified. The remain- 
ing eight patients, called unimproved, were 
still severely impaired in one or more areas 
at the time of discharge in addition to mild 
impairment in others. The ratings at time of 
discharge only were used since they corre- 
lated — .90 with improvement measured by 
the change in ratings from admission to dis- 
charge. 

Analysis of Interpersonal Test correlations⁴
supported the hypothesis that improved
patients would agree better with their
relatives about the nature of their
interrelationship than unimproved patients.
Correlations for each half of the test (Pt. “I-him”
with Rel. “He-me” and Pt. “He-me” with
Rel. “I-him”) were computed for each testing
(at admission and at discharge). From
these four measures of agreement, three
measures of change or shift in agreement
were derived from the data.

4 We are grateful to Mrs. Sarah Dean for her
participation in the statistical analysis of the data.

These measures were the “I-him Shift,” 
the “He-me Shift,” and the sum of these two, 
called the “Total Shift.” They were computed 
as follows: 


Correlations were converted to z scores 
(3) and the absolute change in z score from 
Time I to Time II (admission to discharge) 
was calculated. For example, Patient L. had 
the following correlations with his wife: 


Pt. “I-him” with Rel. “He-me”

        A                  B
      Time I            Time II
     r      z          r      z
    .36    .38        .65    .78

Pt. “He-me” with Rel. “I-him”

        C                  D
      Time I            Time II
     r      z          r      z
    .44    .47        .62    .73

I-him Shift = B - A = .40
He-me Shift = D - C = .26
Total Shift = (B + D) - (A + C) = .66
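
The shift computation for Patient L. can be sketched in a few lines. The sketch assumes that the z scores are Fisher's r-to-z transformation (the inverse hyperbolic tangent of r), rounded to two decimals before differencing, which reproduces the values in the worked example; the helper name is hypothetical.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher r-to-z transformation, rounded to two decimals as in the example."""
    return round(math.atanh(r), 2)


# Correlations for Patient L. and his wife, taken from the worked example.
a = fisher_z(0.36)  # Pt. "I-him" with Rel. "He-me", Time I
b = fisher_z(0.65)  # Pt. "I-him" with Rel. "He-me", Time II
c = fisher_z(0.44)  # Pt. "He-me" with Rel. "I-him", Time I
d = fisher_z(0.62)  # Pt. "He-me" with Rel. "I-him", Time II

# Shift scores: change in agreement from admission to discharge.
i_him_shift = round(b - a, 2)
he_me_shift = round(d - c, 2)
total_shift = round((b + d) - (a + c), 2)

print(i_him_shift, he_me_shift, total_shift)
```

Because the Total Shift is the sum of the two half-test shifts, a positive value indicates that agreement between patient and relative rose between the two testings.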


Means of the three Shift scores were
calculated separately for the three patient groups.
Table 1 shows the number of patients in each
rated group falling above and below the
median of the total sample. On all shift
measures both improved groups differed from the
unimproved group but not from each other,
as shown in Table 2.


Table 1


Shifts in Agreement with Relatives on Interpersonal
Test for Patients Rated According to
Degree of Improvement


                                     Moderately
                      Unimproved      Improved      Improved
I-him Shift
  High agreement           0              5             7
  Low agreement            8              4             1
He-me Shift
  High agreement           1              5             6
  Low agreement            7              4             2
Total Shift
  High agreement           1              5             6
  Low agreement            7              4             2


Table 2


Mean Differences on Shifts in Agreement for Patients
Rated According to Degree of Improvement


                              Total        I-him        He-me
Patient Groups                Shift        Shift        Shift
Improved                       +49          +35          +14
Moderately Improved            +30          +17          +13
Unimproved                     -25          -16           -9


Significance of Differences


                              d     p      d     p      d     p
Improved vs Mod. Imp.        19    NS     18    NS      1    NS
Improved vs Unimproved       74   <.01    51   <.001   23   <.10
Mod. Imp. vs Unimproved      55   <.02    33   <.05    22   <.10


Improved Group 


Agreement with the relative increased an 
average of 49 points on Total Shift for the 
improved group. This increase arose more 
from the I-him Shift than from the He-me 
Shift. Six of the patients in the group fell 
above the median and two below on both 
Total Shift and He-me Shift, while seven 
were above and one below on I-him Shift. 
The improved group showed greater
increased agreement with their relatives than
did the unimproved group.⁵ There were
consistent mean differences between the
improved and moderately improved groups on
all three measures, but these differences were
not statistically significant.


Moderately Improved Group 


Agreement with relatives increased an
average of 30 points on Total Shift for the
moderately improved group. These patients also
differed on all three Shift measures from the
unimproved group, but less markedly than
the improved group. On all three measures,
five of the group were above the median and
four below.

5 Based on t test for difference between means.


Unimproved Group 


Agreement with relatives decreased 25 points 
from admission to discharge for the unim- 
proved group. As reported above, these dif- 
ferences separated the group from both im- 
proved groups on all three measures. On the 
I-him Shift all eight of the unimproved pa- 
tients fell below the median of the total sam- 
ple. Seven of the eight were below the me- 
dian on both the Total Shift and the He-me 
Shift. 

It is apparent that increased agreement 
with relatives on the Interpersonal Test is 
closely related to clinically rated improve- 
ment from psychiatric illness. 

Other characteristics of the sample, namely 
sex, diagnosis, type of treatment, and length 
of hospitalization were examined for their in- 
fluence on the results. Length of hospitaliza- 
tion was the only one of the four showing a 
relationship; the unimproved patients were 
hospitalized longer, on the average, than the 
patients rated as only moderately improved. 
Since the improved group fell between these 
two on length of hospitalization, it was ap- 
parent that the difference did not account for 
the test findings. 


Discussion 


It is possible to examine the individual sets 
of Q sorts for the sources of agreement and 
disagreement between patient and relative. 
Items of disagreement vary from pair to pair, 
however, and it would seem that the critical 
factor involved is the failure or inability of 
unimproved patients to communicate effec- 
tively or, as Ruesch (8) has put it, “to cor- 
rect their information according to feedback.” 
The use of a close relative to measure a pa- 
tient’s communicative effectiveness is a severe 
test, since improved communication might 
occur with others in the environment but 
not with the spouse or parent. These relatives 
are frequently the focus of long, deeply 
entrenched communicative distortions which 
ultimately culminate in severe disturbance 
and are not readily corrected. Block (1), 
Block and Bennett (2), and Kalis (4) have 
demonstrated that people assume different 
roles at the interpersonal level, so that the 


14 


relationship with the relative tested reflects 
the effectiveness of only one such role. 
Ideally, corresponding use of the test with 
therapists and others in close association with 
the patient would provide a more stable index 
of that patient’s increased communicative ef- 
fectiveness. The simplicity of the design that 
was used, involving only the relative, serves 
to emphasize the sensitivity of the test for 
measuring improvement, and suggests other 
possible uses of the test in a clinical setting.


Discrepancies between clinical ratings of 
improvement and agreement with others on 
the Interpersonal Test would be of particu- 
lar interest. If the test reflects improvement 
while staff opinions disagree, or vice versa, 
exploration of possible sources of the dis- 
agreement could lead to a better understand- 
ing of the patient, his illness, and the nature 
of his interactions with others. Use of the test 
with staff members might also lead to better 
understanding of staff interactions with the 
patient. 

Attempts to relate the test findings to 
status of patients one year after hospitaliza- 
tion were not illuminative. Neither communi- 
cative ability at discharge nor clinical ratings 
were predictive of a patient’s future adjust- 
ment. A retrospective analysis of test differ- 
ences between patients who remained symp- 
tom-free and those who were rehospitalized 
might suggest some predictive indices. Tests 
at the time a patient leaves the hospital 
cannot anticipate the difficulties he will en- 
counter in the future, however, and it seems 
sufficient at present to evaluate a person’s 
effectiveness at a given time. Neither does 
the Interpersonal Test predict which persons 
entering a psychiatric hospital will improve 
and which will not. It merely measures some 
of the communicative correlates of improve- 
ment, when such improvement has occurred. 


Summary 


This study was designed to measure changes
in communicative behavior of psychiatric
patients during hospitalization and to relate
such changes to independently derived
indices of clinical improvement.

Twenty-five hospitalized psychiatric
patients and the relative accompanying each to
the hospital were examined at the time of
the patient’s admission and again at the time
of discharge. Each pair was given the
“Interpersonal Test” which consists of reciprocal
Q sorts bearing upon the nature of the
relationship as perceived by the two
participants and reflecting their respective
agreement and disagreement.

Clinical improvement was assessed with
the use of ratings pertaining to various areas
of the patient’s functioning. These ratings,
made by a psychiatrist independently of the
Q sorts, were applied to four different
periods of the patient’s illness: premorbid status,
time of admission, time of discharge, and time
of follow-up approximately one year after
discharge.

The results indicate that significant
differences exist between patients rated
“improved” and those rated “unimproved.” The
“improved” groups showed better agreement
between patient and relative following
hospitalization.

This investigation demonstrates that
measurement of mutual agreement represents a
valid technique for assessing clinical
improvement; it further shows that what is called
improvement is in part based upon observation
of changes in communicative behavior.


Received April 10, 1956. 
Maze Test Reactions After Chlorpromazine¹


S. D. Porteus 


Territorial Hospital, Kaneohe, Hawaii 


In a current investigation of the effects of
chlorpromazine,² 50 male patients with
psychoses of various types were given 300 mg.
of the drug daily for four months and their
ward behavior rated at three-week intervals
by specially devised rating scales of eleven
traits or trait complexes. All were inmates of
closed wards, with hospital residence ranging
from 3 to 57 years. Other therapies such as
insulin coma and electric shock had been used
without lasting benefit. Patients can
therefore be considered typical chronic psychotics
such as can be found in the “back wards” of
any state mental hospital.

With the approval of the medical director,
Dr. Robert Kimmich, the “double-blind”
approach was adopted, the same number of
patients on another closed ward serving as
controls and receiving placebos, all medication
being closely supervised by Dr. John Regan
of the psychiatric staff. The whole design was,
we believe, a good example of hospital
teamwork, the psychologist assuming direct
responsibility for the research, the psychiatrist
for treatment.

The therapeutic results of chlorpromazine
will be reported fully in another article. A late
development in the study, namely, changes in
Maze Test scores, seemed to be of such
importance from both the psychological and
psychiatric points of view that it was
determined to give it priority of publication. With
regard to the behavior ratings, it will be
sufficient to report here that 63% of the
chlorpromazine patients showed significant or
marked improvement. On the other hand,
only 11% of the placebo group showed
improvement. Thus, it may be stated that, after
allowing for suggestibility of both raters and
patients, over 50% of chronic male psychotics
benefit to a marked degree by continued
medication with chlorpromazine.

1 Study supported in part by a grant from the
James McKeen Cattell Fund, New York.

2 Smith, Kline and French generously supplied free
of cost all the chlorpromazine necessary for this
research.


Maze Test Applications 


The other development arose through what 
was originally regarded as a minor phase of 
the research design, namely, the application 
of the Porteus Maze Tests before and after 
the use of the drug. Unfortunately, only 13 
patients of 50 were found who were amenable 
to testing or whose Maze results had any 
meaning. These cases were tested by T. 
Greenland, a graduate psychology student, 
under the writer’s immediate supervision. 
After four months’ medication, these patients 
were retested by the writer, using the prac- 
tice-free extension series of Mazes (10). 
Later, two male and seven female cases were 
added to the group, two of the women having 
had larger dosages for a shorter time, while 
five had had the same dosage but for a pe- 
riod of six weeks instead of four months. 

On the basis of a study by Peters and Jones 
(9) who had found that social or ward be- 
havior improvement after psychodrama-ther- 
apy was reflected in significant gains on the 
Porteus Maze, there was every expectation 
that improvement would be shown by simi- 
larly improved chlorpromazine patients. 

The only study in which Maze results are 
reported seems to be one by Gardner, Haw- 
kins, Judah, and Murphree (5). They stated 
that after chlorpromazine four out of nine 
patients improved in Maze scores as against 
a similar result in eight out of ten reserpine 
patients, while placebo patients declined. 
However, apparently only the standard Maze 
series was used both before and after medi- 
cation, and thus an indeterminate amount of 
improvement must be attributed to practice. 


Decline in Maze Scores 


Considering that Peters and Jones reported 
a gain of 2.5 years in Maze Test age as 
against an average gain of only 0.8 year for 
their control group, the writer was very much 
surprised to find a reverse trend for chlor- 
promazine patients who were also socially im- 
proved. Instead of a gain, there was a net 
loss (algebraic sum of gains and losses di- 
vided by N) of 2.06 years. The percentage 
of patients losing in score was 68.2%, 3 or 
13.2% remained the same, and 4 or 18.1% 
gained. The percentage who gained in the 
study by Gardner et al., was 44%, but it is 
impossible to state how this figure was af- 
fected by practice. 


Comparison With Lobotomy 


The fact that a similar impairment in Maze 
Test performance has been found in every 
important study of the psychometric results 
of psychosurgery makes desirable a review 
of these investigations. The suggestion that 
chlorpromazine acts as a chemical or phar- 
macological lobotomy is, of course, not new. 


In an intensive review of the subject by 
Dundee: 


The mental effects of prolonged administration of 
chlorpromazine have been called “pharmacological 
frontal lobotomy” (Terzain, 1952). Patients show a 
lack of spontaneous interest in their surroundings, 
are generally immobile, and at first glance seem to 
be heavily drugged. However, the higher psychic 
functions are preserved to a remarkable degree and 


patients are capable of sustained attention and con- 
centration (3, p. 362). 


Other references of the same kind are nu- 
merous and seem to be based mainly on some 
similarities of effect. Both lobotomy and 
chlorpromazine are used for the relief of in- 
tractable pain; both diminish anxiety or self- 
concern; both seem to result in improved ap- 
petite and sudden increase in body weight. 
If in addition it can be shown that the Maze 
reactions are similar in psychosurgery and 
chlorpromazine therapy, thus providing a 
comparison of measurable mental effects, the 


analogy would be more complete and mean- 
ingful. 


The Maze Test and Psychosurgery


Probably no apology is needed for reviewing
Maze Test results after various forms of
frontal lobe operations. Neurosurgery is
usually outside the sphere of psychologists’
interests. Many of them appear oblivious to the
relevancy to psychometry of psychosurgical
findings, even though psychologists, such as
Landis and his co-workers, have played a vital
role in evaluation of the mental effects of
these operations. This lack of interest is
astonishing considering their bearing on the
meaning and validity of objective tests and
how the new approaches to psychiatry open
up whole fields of usefulness for the
psychological members of the hospital team.

My own interest in the matter dates from
1942 when a happy conjunction of a
neurosurgeon, a psychiatrist, and a clinical
psychologist resulted in a study that showed
for the first time marked impairment in
Maze-tested abilities after lobotomy. Porteus and
Kepner (12) reported these mental changes
in 1944. The study was continued by Porteus
and Peters, with publication of another
monograph (13) dealing specifically with the rather
dramatic validation of the Maze as “a frontal
lobe brain test.” This study confirmed the
previous results. Porteus and Kepner had
reported a net loss after lobotomy of [illegible]
years, the decline in score occurring in
76.[illegible]% of the cases. In the second, larger
group ([number illegible] cases) the net loss was
1.65 years and affected 81% of the group at
the first or a later postoperative testing. A
control group of 55 criminals (unoperated) made
a gain of 1.6 years on their second application
of the Maze. Porteus and Peters pointed out
that social recovery seemed to be closely
associated with a pattern based on repeated
applications of the Maze, namely, a marked
initial postoperative decline in score, followed
by steady successive increments of score up
to and beyond the preoperative level.


Consideration of these findings led to
the inclusion of the Porteus Maze in a
battery of 35 tests applied in the Columbia-
Greystone investigation on the effects of
topectomy (gyrectomy). The study was
reported most adequately in 1949 (1). The
operated group, as a whole, lost 1.21 years
on Maze Test age. The pattern of response
described by Porteus and Peters as
associated with social improvement was noted by
H. E. King (6) who reported the
psychometric changes as “a marked relation
between performance in this test and social
improvement.” Five out of six patients selected
by psychiatrists as showing the greatest
improvement exhibited the characteristic
pattern of an immediate postoperative loss of a
year or more in score, followed by a regain
in performance up to or above the
preoperative level. The successive average scores of
patients discharged from the hospital within a
year were 13.5, 11.4, 13.2, 14.6, and 15 years.
Thus it was clear that an initial loss in
Maze-tested functions was characteristic of
patients who suffered excisions of cerebral cortex
in various frontal areas, particularly areas 8,
9, 10, 46, with practically no deficits in
regard to area 11 in the orbital region.

The second Columbia-Greystone project in- 
volved a variety of surgical insults to the 
frontal lobes, including two types of venous 
ligation, anterior and posterior, thalamotomy, 
thermocoagulation of portions of areas 9 and 
10, and transorbital lobotomies. The average 
loss in Maze Test age for all cases was 1.5 
years postoperatively, but ranged from 0.8 
year in the transorbital lobotomy group to 
4.7 years in the thalamotomies. The loss in 
the more posterior venous ligation patients 
was much more severe than for the more an- 
terior operation. The results were reported in 
book form in 1952 (2). 

The third study of maze scores was de- 
scribed by Sheer (15) who, like King, worked 
under Landis. This investigation concerned 
orbital and superior areas of the frontal cor- 
tex. Again the posterior-superior situs resulted 
in the most serious Maze deficits, with less 
apparent practice improvement for successive 
applications than could be observed in the 
control group. The orbital group lost only 
1.13 years in test age, the superior group 2.88 
years. The Wechsler-Bellevue showed a loss 
of 2 IQ points, but if the Maze results are 
expressed similarly, the Maze Test loss was 
16 points. 

All three Columbia-Greystone project findings
were summarized by Landis at the third
Psychosurgical Conference in New York in
1951:

In the battery of tests which was used in the first 
Greystone, the second Greystone, and the New York 
State Project, we included both the standard Wech- 
sler-Bellevue and the Porteus Maze Test. In the test- 
by-test analysis of the results which were obtained, 
the only intelligence test which showed a uniform or 
almost uniform loss during the first month after op- 
eration compared to the preoperative performance 
on this battery of tests was the Porteus Maze Test. 
Dr. Porteus had previously reported this sort of loss 
after lobotomy. We confirmed his finding that a 
brain operation performed on the frontal lobes gives 
rise to an immediate postoperative loss in mental 
age of 1 to 2 years in some 80 percent of psycho- 
surgery patients (7, p. 109). 

In all probability, the average loss would 
have been much greater if a practice-free 
form of the test had been available for post- 
operative examinations. The question of per- 
manency of the defects in Maze-tested abili- 
ties could not be settled by the Greystone- 
Columbia studies, since no definite allowance 
could be made for the effect of practice. That 
it was an important factor was shown by 
Sheer (15) who gave the Maze twice before 
operation. The control groups increased 1.60 
score points before operation, and the pa- 
tients to be operated, 1.79 points. Thus the 
postoperative testing became the third appli- 
cation of the Maze, and what the practice 
effects would amount to under those condi- 
tions, no one knows. Again it is necessary to 
point out that these differences are expressed 
in years of test age whereas results for the 
Wechsler-Bellevue are all reported in IQ 
points. 


Robinson Study 


The only evidence so far presented that 
bears on the problem of permanency of Maze 
Test deficits has been supplied by Robinson 
(14) who tested 68 of the Freeman-Watts 
patients, the elapsed time since lobotomy be- 
ing over three years. This was the first ap- 
plication of the Maze, so that no practice ef- 
fects were involved. She used as controls 12 
patients who had been discharged from the 
hospital without benefit of lobotomy. In a 
private communication, Dr. Robinson ex- 
pressed the opinion that 50 percent of the 
lobotomy cases could get along in the community
“without control or supervision.” On
a rough scale of social participation the correlation
with Maze scores was .42. Considering
all the variables involved, this was as high
as I should expect.

3 Letter dated Sept. 10, 1951.

The Maze score of the controls as reported 
by Robinson in the second edition of the 
Freeman-Watts book (4) was 13.9 years, 
10.6 years for the lobotomy patients, a de- 
ficiency of 3.3 years. Expressed as IQs for 
adults the figures would be 98 and 75 re- 
spectively. Ten patients who suffered “radi- 
cal” (more posterior) lobotomies tested 9.6 
years, IQ 68. It may be of interest to note 
that the average Maze Test age of these pa- 
tients was below that of Australian aborigines 
tested by the writer. All the evidence avail- 
able shows that the more anterior the frontal 
lobe injury the less the Maze deficits. Transorbital
lobotomy causes a much less significant
loss.


Comparison with Chlorpromazine Findings 


From Sheer’s table of scores of 36 cases 
used in the New York State Project, I have 
calculated their average postoperative test 
age to be 11.12 years; the Porteus-Kepner 
(N = 17) age was 9.41 years, while the 
chlorpromazine group scored 9.5 years. It 
should be noted that, in the first named 
study, two adult tests were used, raising the 
maximum score two years. For purposes of 
ready comparison I have calculated the losses 
following frontal lobe operations and have 
listed the chlorpromazine cases thereafter. It
is probable that if the practice-free extension
series had been used for the retesting of psychosurgical
cases their deficits would have
been larger postoperatively. The general situation
as regards Maze Test impairment in the
various investigations appears in Table 1.


Table 1

Loss on Maze Test in Several Studies of Frontal Lobe
Operations, and in Chlorpromazine

Study                              Result

Columbia-Greystone I               1.2 years postoperative loss
Columbia-Greystone II              1.5 years postoperative loss
New York Brain Study
  Orbital cases                    1.13 years postoperative loss
  Superior topectomy               2.88 years postoperative loss
Robinson study (N 18)              3.3 years below controls
Porteus-Kepner (N 17)              1.91 years postoperative loss
Porteus-Peters (N 16)              1.81 years postoperative loss
Chlorpromazine group (N 22)        2.06 years postmedication loss


Personality Reactions in the Maze 


In discussing the reactions of psychosurgery
patients, Landis and Erlick (8) describe
them in these terms: “It is as though there
were a decrease in the vigilance or alertness
of the patient, the changes being as
varied as those which might occur in any individual
that was drowsy or sleepy while
taking a test.” It should be said that while
drowsiness is typical of chlorpromazine patients
in the early stages of medication, it
did not appear to be present at the time of
testing. Nevertheless, there were very [illegible]
losses in foresight. Some specific reactions towards
mistakes in the Maze threading are
worth noting. Though concern was
voiced, the examiner felt that it was
superficial. One or two patients were
apologetic for repeated unsuccessful trials,
stating that they were sorry to waste so much
of the examiner’s paper. Others would comment
on their own mental dullness but show
little or no real concern. As in the case of
lobotomy patients there was a tendency to
give up, or relax vigilance when the tests became
difficult, even when the utmost care
and prudence had been used in the lower
level tests. It was as if the patient said, “I
have done all I can. The rest does not matter.”
This attitude was reminiscent of [illegible] that,
when the task was beyond them, shrugged
their simian shoulders and went on to
something else.

There is no typical or general reaction
to the Maze of either psychosurgery
or chlorpromazine patients but a great variety
of attitudes. Some are nonchalant in the
face of failure, some are cautious, conscientious
workers. Some appear to think a great
deal depends upon their success or failure,
most seem to enjoy the experience; some are
apologetic, others seem emotionally flat. In
comparison with normals the reactions of patients
seem more positively notable. In a word,
they are “different.”


Table 2

Maze Scores Before and After Lobotomy and Chlorpromazine

Case   Lobotomy (Porteus-Kepner)    Lobotomy (Porteus-Peters)    Chlorpromazine
No.    Pre-Op.  Post-Op.  Dif.      Pre-Op.  Post-Op.  Dif.      Pre-Med.  Post-Med.  Dif.

F 8.5 10.5 +2.0 6.0 10.5 +45 
1 eae is 105 ts {10 60 65 +05 
2 145 3 0 145 15.0 +05 140 145 +05 
3 ie 0. 110 110 0. 160 165 +05 
s a5 EaR OS EDS b fey Uae a 
> aT aR 10.0 95  —0.5 TS 0; 

: $ “10 70 65 —05 16.5 5 : 
7 12S N 100 S0 1.0 65 60 015 
8 5.0 ns fie 8.5 75 10 13.5 130 0:5 
9 is oe 20 90 3.0 80 7) 75h meeps 
10 13.0 eh Be 95 WOO) 235 120 95 235 
1 ae eee 110 70 —40 15:5, Gs = 
12 14.0 A tari 14.0 95 4.5 150 ans -3.5 
13 tee a a ESO 140 95 4.5 15.0 10.0  =50 
14 mae Me a BAe 13.0 85 —45 1400 75 65 
foe we a 
17 us 85 m50 11.0% -gs aes 
18 13.5% 85 E E 
19 15.5 1i 10.0 E sS 
20 150* 90 60 
21 15.5% aoe Meio 
22 


Porteus-Kepner: loss 1.91 years, 76.5% of cases
Porteus-Peters: loss 1.81 years, 68.75% of cases
Chlorpromazine: loss 2.08 years, 68.2% of cases


* Female patients. 
Commonly Observed Tendencies 


Only two specific tendencies seem to the
writer to be more general with psychosurgical
and chlorpromazine patients on the one
hand, than with normals on the other. The
first seems to be a tendency to come to a sudden
full stop when the more complex levels of
the test are reached. This was called “shatter
effect” by Brundage,⁴ but our groups, especially
chlorpromazine cases, are at present
too small to allow definite comparisons with
normals.

The other trend seems to be more fre- 
quently observable and more susceptible to 
analysis. This is the tendency to repeat the 
same errors, that is, enter the same blind 
alley more than once. Naturally, this cannot 
occur except when there is failure in two 
trials in tests up to and including year XI. 
In tests XII, XIV, where four trials are allowed
and in Adult I (three trials), the tendency
to repeat errors can be more readily
detected.

4 In a personal communication, Dec. 26, 1943.

In the Porteus-Peters study, 20.8% of the 
patients had two or more repeated errors in 
their prelobotomy tests, 37.1% in their first 
postlobotomy Maze examination. The chlor- 
promazine cases also showed 20% with two 
or more repeated errors before medication, 
45% after medication. The average number 
of all repeated errors was 1.1 before medi- 
cation, 1.55 after. Thus a tendency to repeat 
errors seems to be characteristic of psychotic 
patients but is accentuated after both lob- 
otomy and chlorpromazine. 


Individual Scores 
To give a better comparative view, I have 
shown in Table 2 the individual Maze scores 
of these groups, 17 cases of the Porteus- 
Kepner study, 16 “much improved” cases of 
the Porteus-Peters monograph, and the 22
chlorpromazine patients. In the last named
group, the female patients’ scores are given 
after the males. The number (seven) is small 
but the results indicate that all lost ground 
on the Maze and the deficits were greater 
than for males. This trend calls for further 
investigation. Again it should be stressed that 
had the practice-free form been used in the 
first two studies, the Maze impairment would 
undoubtedly have been greater. 

Original data are given in the table for the 
purpose of showing that the decline in Maze 
scores occurs at all levels and has a wide 
range of variation. In other words, the effect 
of chlorpromazine, like that of psychosurgery, 
differs very greatly in different individuals, 
illustrating the tremendous complexity of the 
neurological bases of human behavior. 


Neurological and Practical Bearings 


Questions as to where in the central nerv- 
ous system chlorpromazine exerts its influence 
are only relevant to the purpose of this paper 
insofar as the data cast some light also on the 
localization of Maze-tested functions. In Dundee’s
review (3) he states: “Terzain (1952)
concludes that the effects of chlorpromazine
represent depression of the reticular formation,
particularly the sensory and autonomic
spheres. . . . From an analysis of the effects
of chlorpromazine on the central and autonomic
systems, Aron (1954) has come to the
conclusion that its site of action is probably
in the hypothalamus. . . .” Since lobotomy
interferes with the thalamocortical connec- 
tions, this extends the parallel with chlor- 
promazine. Topectomy of areas 9 and 10 also 
causes loss of anxiety and vigilance as meas- 
ured by the Maze, and quite conceivably, the 
mental alertness necessary in Maze threading 
is mediated by the reticular formation. Other 
brain areas are probably involved in Maze 
performance, this matter having recently been 
discussed by the writer (11). The chlorpro- 
mazine and lobotomy patients are, as re- 
gards Maze performance, literally “asleep at 
the switch.” In other terms, both treatments 
interfere with capacity for prehearsal. 

The bearing of these findings on psycho- 
pathology is clear and significant, but they 
have no bearing on the legitimate use of 
chlorpromazine, especially in mental hospital
practice. If the patient is severely or completely
disabled by psychosis and can be significantly
improved by chlorpromazine, the
occurrence of some deficits in planning and 
mental alertness will surely not deter the 
psychiatrist from administering the drug. The 
drawbacks would be far outweighed by the 
advantages. As regards the benefit to hos- 
pitalized psychotics, the drug seems to usher 
in a new era in treatment. But if permanent 
deficits follow, then the use of the drug in 
minor mental ailments or with emotionally 
disturbed children would certainly be contra- 
indicated. It was, no doubt, such casual ap- 
plications which the Research Committee of 
the American Psychiatric Association had in 
mind when it issued its recent warning that 
the indiscriminate use of the tranquillizing 
drugs constituted a public danger. 


Further Research 


In relation to our own study, two steps in 
further research are obvious, in addition to 
confirmation of present findings with larger 
groups. The first is to discover the relation of 
social improvement to Maze Test patterns of
response as an aid to patient selection. The
second and more important is to find out 
whether the deficits are permanent or transi- 
tory. It is quite possible that on this point the 
similarity of pharmacological lobotomy to 
psychosurgery may break down. It may well 
prove to be the case that unlike psychosur- 
gery, the effects are reversible. But only care- 
ful and prolonged research will provide an- 
swers to this most important question. We 
have already suggested that the differences in 
male and female Maze reactions after chlor- 
promazine should be further investigated. Ex- 
tension of our study to the effects of others 
of the so-called “ataractic” drugs is obviously 
desirable. Samples of the behavior scales and 
tests will be supplied on request. 


Summary 


In connection with a study of behavior 
changes following long continued use of 
chlorpromazine with chronic psychotic pa- 
tients, the Porteus Maze Test was applied to 
13 males who were accessible to testing and 
whose scores showed any significance. After 




four months’ medication the test was reap- 
plied. 

In spite of measured social improvement, 
there was a marked drop in scores. The num- 
ber of cases was then augmented by nine, 
seven of whom were female patients who had 
taken an equal dosage of chlorpromazine for 
six weeks. Analysis of postmedication results 
for the whole group of 22 cases revealed an 
average Maze Test deficit of 2.06 years, af- 
fecting over 68% of patients. 

This evidence that chlorpromazine acts as 
“a pharmacological lobotomy” is summarized 
by means of a review of all important psycho- 
surgical studies in which the Porteus Maze 
was used, and the deficits found are com- 
pared with those now demonstrated to follow 
prolonged chlorpromazine medication. 

The point is emphasized that any parallel 
with psychosurgery has little bearing on the 
use of the drug with hospital mental patients. 
It does, however, serve as a strong contra- 
indication towards its indiscriminate use for 
lesser mental disorders, and with children. 
Other implications are briefly discussed and 
future steps in research on the subject are 
indicated. 


Received October 15, 1956. 
A Cross Validation of Starer’s Test of
Cultural Symbolism¹


William D. Winter
San Jose State College

and James W. Prescott
Marquette University


In a recent study (1), Starer described a 
technique for investigating whether or not 
subjects (Ss) tend to classify elongated, 
pointed, or angular designs as male symbols 
and round, curved, or containing designs as 
female symbols. In his procedure the S is
asked to match ten designs, half of which are 
male and half female according to the above 
definitions, with ten first names, five male and 
five female. A correct matching consists of 
the S’s placing together a name and a de- 
sign of the same sex. In his original investiga- 
tion, Starer found that psychotic and normal 
Ss correctly matched the designs and names 
at a level significantly greater than expected 
by chance, suggesting the existence of a gen- 
erally accepted sexual symbolism in this cul- 
ture. It was Starer’s impression that the in- 
ability to match symbols correctly is related 
to psychotic confusion. 

In an effort to repeat Starer’s study and to 
obtain evidence on this latter observation, 52
male and 55 female hospitalized mental patients
were individually tested using Starer’s
technique, and were also given the group form
of the MMPI. The distributions of number of
correct matchings for both male and female
patients significantly differed from chance expectations
(p < .001), and the obtained frequencies
were similar to those reported by
Starer.

1 An extended report of this study may be obtained
without charge from William D. Winter,
Psychology Department, San Jose State College,
California, or for a fee from the American Documentation
Institute. Order Document No. 5087 from
ADI Auxiliary Publications Project, Photoduplication
Service, Library of Congress, Washington 25,
D. C., remitting in advance $1.25 for microfilm or
$1.25 for photocopies. Make check payable to Chief,
Photoduplication Service, Library of Congress.

However, the variable of number of correct 
matches was not significantly related to any 
of the commonly used MMPI scales, includ- 
ing those usually associated with psychotic 
thinking, such as F and Sc. This finding, to- 
gether with the fact that there is no signifi- 
cant difference between our female psychotic 
patients and Starer’s normal female Ss in the 
number of correct matchings, fails to sub- 
stantiate Starer’s clinical impression that in- 
correct matches are related to psychosis. This 
lack of relationship with the MMPI leaves 
us in doubt as to whether the ability to 
match sexual symbols correctly is related to 
specific personality factors or is simply a re- 
flection of the membership of an individual 


in this society. 
Brief Report. 
Correlation of a Modified Form of Raven's Progressive
Matrices (1938) with the Wechsler
Adult Intelligence Scale¹


Julia C. Hall 


Veterans Administration Hospital, Bronx, New York 


Among brain-damaged patients motor im- 
pairment is often so severe that the perform- 
ance tests of the Wechsler Adult Intelligence 
Scale (WAIS) are inappropriate as measures 
of intellectual function. Since research find- 
ings (6) indicate that the performance tests 
are more apt to reflect the effects of brain 
damage, it cannot be assumed that the WAIS 
Verbal Scale gives a comprehensive survey of 
the intellectual function of the brain-damaged 
individual, and substitutes for the performance
tests would, therefore, be very useful. Raven’s
Progressive Matrices (7, 9) is one potentially 
useful substitute because performance on the 
test is not affected by motor impairment and 
some of its characteristics suggest that it is 
apt to be a more sensitive indicator of im- 
paired function than are verbal tests. The 
relevant characteristics are as follows: (a) 
Matrices is a relatively univocal test of rea- 
soning (12), and ability to reason is thought 
to be highly susceptible to brain damage; (b) 
Matrices and the WAIS performance tests 
have similar age decline curves (8); (c) re- 
ported correlations between Matrices and the 
Block Design test of the Wechsler-Bellevue 
are considerably higher than are correlations 
with the tests of the Verbal Scale (1, 4). The 
investigation reported here studied the cor- 
relation between the WAIS and a modified 
form of the Progressive Matrices (1938). 


1 The author wishes to express her appreciation to
the staff and trainees of the Clinical Psychology Section,
Bronx VA Hospital, for their generous cooperation
in giving the tests on which this study is based
and to Drs. H. L. Flowers, Chief, Neuropsychiatric
Service, and R. S. Morrow, Chief, Clinical Psychology
Section, for their sustained support.


Procedure


In its standard form Matrices is an un- 
timed, 60-item test which usually takes 40-50 
minutes. Since in the clinical situation 50 
minutes devoted to so homogeneous a test as 
Matrices is excessive, a 30-item form, using 
an odd-even method of selection, was made 
up and a 20-minute time limit imposed. Some 
deviations from odd-even selection were made 
in order to secure items less demanding of 
visual acuity. The specific items used are 
shown in Table 3. Since a highly speeded test 
was not desired, the time limit chosen was 
one which pilot-study findings indicated was 
sufficient for most individuals to complete 
the test. 

The subjects of the study are all those pa- 
tients tested on the Neuropsychiatric Service 
during the period from April to December, 
1955, who passed the research exclusion cri- 
teria. The exclusion criteria were: (a) not 
psychotic, (b) no evidence of brain damage, 
(c) if Negro, not educated in a southern 
state, and (d) not educated in a foreign coun- 
try. The brain-damage criterion was inter- 
preted stringently; that is, even in the ab- 
sence of neurological symptoms, either a 
history of head trauma, or of shock therapy, 
or an abnormal electroencephalogram was 
sufficient for exclusion. Eighty-two individu- 
als (all males) fulfilled the research criteria. 


Results 


The data in Table 1 indicate that the sam- 
ple is fairly heterogeneous with regard to age, 
education, and IQ. The shape of the Full
Scale IQ distribution closely approximates
that of a normal curve (χ² of 1.846 with
5 df, .90 > p > .80).


Table 1

Range, Mean, and Standard Deviation of Age,
Education, and WAIS IQ

Variable                 Range      Mean      SD

Age (to nearest year)    19-49      31.71     7.29
Education                7-17       11.67     2.48
IQ: Verbal               89-139    110.81    13.45
    Performance          80-134    102.45    13.18
    Full Scale           82-138    108.67    12.52

Despite the fact that the mean WAIS Full 
Scale IQ of this sample (108.67) is fairly 
close to the hypothetical population mean 
(100.00), the Matrices scores tend to cluster 
close to the ceiling. The interval between the 
mean and the highest possible score is only
nine points and 63% of all scores lie in this 
interval. Chi-square analysis of the shape of 
the distribution showed significant (p < .05) 
departure from both normality and symmetry. 

Because of the effects of departures from
symmetry on the correlation coefficient, one
of the two possible correlation ratios (the regression
of the WAIS variable on Matrices)
was computed for each of the comparisons
between Matrices and the 14 WAIS variables
and tests for departure from linearity of regression
were done (5, pp. 268-275). Using
the .05 probability value as the level of significance,
it was possible to retain the hypothesis
of linear regression in all comparisons
except that between Matrices and Object
Assembly. The correlation ratio expressing
the regression of Object Assembly on
Matrices is .545. Table 2 shows the means
and standard deviations of the 15 test variables
and the correlation coefficients (rs) of
Matrices with each of the 14 WAIS variables.


Table 2

Correlation of Modified Matrices with Each of the
WAIS Subtests and with the Verbal, Performance
and Full Scale WAIS Scores

Tests                    N     Mean      SD       r

Matrices                 82    20.70     5.15
WAIS:
  Information            82    19.42     4.63    .506
  Comprehension          82    20.10     4.17    .598
  Arithmetic             82    13.38     3.03    .452
  Similarities           77    15.92     3.76    .411
  Digits                 82    11.73     2.31    .282
  Vocabulary             82    52.02    16.55    .480
  Digit Symbol           82    50.50    12.77    .538
  Picture Completion     78    14.68     3.82    .602
  Block Design           82    33.43     8.72    .642
  Picture Arrangement    67    26.24     5.98    .617
  Object Assembly        55    30.66     7.71    .366
  Verbal Scale Score     82    71.28    13.75    .584
  Perform. Scale Score   82    51.54    10.04    .705
  Full Scale Score       82   122.82    20.92    .721

Note. Matrices and WAIS subtest statistics are based
on raw scores; those for Verbal, Performance, and Full Scale
are based on summed scaled scores. In those instances in
which the entire WAIS was not given (26 cases), the last
three scores were secured by prorating.
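
The correlation ratio used above as a check on linearity can be illustrated with a short sketch. This is not the computation carried out in the study, which followed the procedure in reference 5; the function name `correlation_ratio` and the toy scores are assumptions introduced only for illustration. Eta is taken here as the square root of the share of the total sum of squares in one variable that is accounted for by the means of the groups defined by the other variable. When the regression is linear, eta and the absolute value of r are about equal; a marked excess of eta over r points to curvilinearity.

```python
from collections import defaultdict


def correlation_ratio(x, y):
    """Return eta for the regression of y on x (x defines the groups).

    Illustrative sketch only: eta squared is the between-group sum of
    squares of y divided by the total sum of squares of y.
    """
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)

    grand_mean = sum(y) / len(y)
    ss_total = sum((yi - grand_mean) ** 2 for yi in y)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return (ss_between / ss_total) ** 0.5


# Hypothetical scores: y rises and then falls across the levels of x,
# so the relation is strong but not linear.
x_scores = [1, 1, 2, 2, 3, 3]
y_scores = [1, 1, 5, 5, 1, 1]
eta = correlation_ratio(x_scores, y_scores)
```

In the hypothetical data every group is internally constant, so the group means account for all of the variation in y even though a straight line would account for none of it.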

Table 3

The Number of Individuals Passing Each of the 30 Items of the Modified Version of the
Progressive Matrices (1938)
(N = 82)

    Set A             Set B             Set C             Set D             Set E
Item    No.       Item    No.       Item    No.       Item    No.       Item    No.
No.*    passing   No.*    passing   No.*    passing   No.*    passing   No.*    passing

 1       8[?]      2      81         1      78         1      82         2      46
 3      82         4      738[?]     3      75         4      70         3      53
 5      80         5[?]   4[?]       6      67         6      67         5      39
 7      74         8      56         8      60         8      53         7      10
10      72         9      55        10      33         9      51         9       2
11      53        12      36        11      36        11      23        12       4

* The item numbers are those of the standard version of Progressive Matrices (1938).
Note. One subject did not attempt item D-9; two did not attempt D-11; three did not
attempt E-2 and E-3; and six did not attempt E-5, E-7, E-9, and E-11.


Reliability. In this study a time limit was
imposed and six subjects did not attempt all
the items. If the scores of these six subjects
are excluded, the Kuder-Richardson reliability
coefficient is .878; the reliability coefficient
for the total sample is .864.
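
As a rough illustration of how a Kuder-Richardson coefficient of this kind is obtained, the sketch below computes formula 20 from a matrix of pass/fail item scores. The report does not say which Kuder-Richardson formula was applied, so the choice of formula 20, the function name `kr20`, and the four-subject data set are all assumptions made for illustration and do not reproduce the study's figures.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for dichotomous (0/1) item scores.

    responses: list of rows, one per subject, each a list of 0/1 scores.
    Assumption of this sketch: the variance of total scores is the
    population variance (divisor n), matching the p*q item variances.
    """
    n_subjects = len(responses)
    n_items = len(responses[0])

    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_subjects
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_subjects

    # Sum of item variances p*q, where p is the proportion passing an item.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_subjects
        sum_pq += p * (1 - p)

    return n_items / (n_items - 1) * (1 - sum_pq / var_total)


# Hypothetical data: four subjects, three items of increasing difficulty.
toy_responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
reliability = kr20(toy_responses)  # 0.75 for this toy matrix
```

Unattempted items would have to be scored in some agreed way (for example as failures) before such a computation; that scoring rule is likewise an assumption here, not something the report specifies.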

Item difficulty. The item data were ana- 
lyzed to secure information concerning the 
difficulty level of each item. Table 3 sum- 
marizes the analysis. 

It is apparent from the data in Table 3 
that there are displacements in item difficulty 
both between sets and within sets and that 
Set D as a whole is not more difficult than 
Set C. These two findings are not unique to 
the modified Matrices. Halstead (3) reports 
similar findings for the standard form of the 
test. 

Discussion 


The reliability coefficient of the modified 
Matrices is encouraging since it compares favorably
with the Kuder-Richardson coefficient
of .90 reported by Sinha (10) for the
60-item form of the test and compares even 
more favorably with the reliability coefficients 
(Spearman-Brown, split half) of the various 
subtests of the WAIS Performance Scale (11, 
p. 13). Since the correlations with the WAIS 
indicate that Matrices has more in common 
with the Performance Scale than with the 
Verbal Scale, the hypothesis that Matrices 
can be useful in the evaluation of brain dam- 
aged individuals is given support. The low 
ceiling of the modified Matrices, however, is 
a severe restriction on its usefulness and ef- 
forts should be directed toward remedying 
this defect. The test ceiling could be raised 
by either of two methods: (a) by placing a 
greater premium upon speed of performance 
or (b) by increasing the difficulty range of
the items. The first alternative has been utilized in
the WAIS. Study of the raw score conversion
table (11, p. 77) indicates that a major por- 
tion of the ceiling of the Performance Scale 
tests is derived from speed of performance. 
When one’s aim is to predict level of intel- 
lectual function in the daily activities of 
adults, however, ceiling secured by emphasizing
speed is of questionable value. There is a
considerable body of evidence which indicates 
that, even in a physically intact adult popu- 


lation, emphasis on speed has an adverse ef- 
fect upon the accuracy with which a score re- 
flects a point on a single dimension and the 
predictive usefulness of scores under speed 
conditions is attenuated. Guilford in review- 
ing empirical investigations of the effects of 
speed conditions on intelligence test perform- 
ance comments: “. . . speed conditions where 
items are not very easy open the door to 
many uncontrolled determiners of individual 
differences in scores” (2, p. 369). Analysis of 
the physical and psychological factors which 
can influence intelligence test performance 
suggests that in a patient population per- 
formance on speeded tests is particularly 
susceptible to the vitiating effects of irrele- 
vant situational variables. 

The considerations discussed above indi- 
cate that the second method of raising the 
ceiling of the modified Matrices, i.e., increas- 
ing its difficulty, is the method of choice. 
Progressive Matrices (1938) in its standard 
form does not have sufficient ceiling to dis- 
criminate among a superior group. Study of 
the item analysis done in this study and that 
of Halstead (3) suggests, however, that a 
selective method of item choice could yield a 
modification with adequate ceiling for popu- 
lation samples having mean IQs similar to 
that of the sample used in the study reported 
here. The item analyses indicate that a num- 
ber of the easy items should be eliminated 
and replaced with items at varying levels of 
difficulty, the selection of items being directed 
toward raising the ceiling and smoothing the 
ascent along the scale of difficulty. 


Summary 


The reliability, item difficulty, and correla- 
tion with the WAIS of a modified (30 item) 
form of Progressive Matrices (1938) were investigated.
The following findings are reported.

1. The Kuder-Richardson reliability coeffi- 
cient for the modified version of Matrices is 
.864 (N = 82).

2. Correlation of modified Matrices with 
the WAIS Performance Scale score is .705;
with the Verbal Scale score, .584; and with 
the Full Scale score, .721. The difference in 
the correlations with the Verbal and Performance
Scales suggests that Matrices may be a
useful complement to the Verbal Scale in 
evaluating the intellectual function of brain 
damaged individuals. 

3. A severe shortcoming of modified Ma- 
trices was its low ceiling. The score distribu- 
tion showed significant departure from both 
normality and symmetry. An analysis of item 
difficulty indicates that a reduction in the 
number of easy items and their replacement 
with items of greater difficulty probably would 
result in a modification having more adequate 
discriminative power. 


Received May 3, 1956. 
The Use of Doppelt’s Short Form of the Wechsler
Adult Intelligence Scale with Psychiatric Patients


Tom D. Olin and Marvin Reznikoff 
The Institute of Living 


In a recent paper, Doppelt (1) proposed 
an abbreviation of the Wechsler Adult In- 
telligence Scale (WAIS) which would pro- 
vide the examiner with an adequate estimate 
of the subject’s IQ without giving all eleven 
subtests of the scale. His short form is com- 
posed of four subtests: Arithmetic, Vocabu- 
lary, Block Designs, and Picture Arrange- 
ment. Using the population in the national 
standardization of the WAIS (2), he ob- 
tained a correlation coefficient between the 
sum of these four subtests and the Full Scale 
score of approximately .96. Regression equa- 
tions were computed for various age groups 
for use in predicting the Full Scale score 
from the sum of the four subtests. The stand- 
ard error of estimate was determined to be 
approximately 7 scaled score points (4.2 IQ 
points). To check his predictive equations, 
Doppelt applied them to two groups of sub- 
jects not used in his statistical analysis and 
found that in 71% of his cases the differences 
between his obtained and estimated Full 
Scale scores were within one standard error 
(± 7); two standard errors (± 14) con- 
tained 96% of the cases. 

Since Doppelt’s procedure appears to per- 
mit an adequate estimate of the IQ in a rela- 
tively brief period of time, it is potentially of 
considerable usefulness in many situations. 
The question arises, however, whether such 
an abbreviated procedure, derived from data 
obtained from a presumably normal popula- 
tion, can be employed reliably in an emo- 
tionally disturbed population. There is con- 
siderable evidence, for instance, that schizo- 
phrenic patients show [...]. The 
purpose of this study, therefore, is to deter- 
mine whether the Doppelt short form of the 
WAIS would provide adequate estimates of 
the Full Scale scores in a disturbed popula- 
tion. 


Procedure and Results 


The subjects included in this study con- 
sisted of all of the patients at a private psy- 
chiatric hospital who had been given the full 
WAIS as part of a routine psychological ex- 
amination. Six patients having final diagno- 
ses of a primarily organic nature were elimi- 
nated from the group, however, in order to 
limit the subject population to functional 
disorders. The remaining subject group con- 
sisted of 107 patients having varied schizo- 
phrenic and neurotic diagnoses and manifest- 
ing a wide range of behavioral deviations. 
Forty-four per cent of the patients were men, 
and 56% were women. The mean age was 
36.5 years and the range was 16 to 69 years. 
WAIS IQs ranged from 78 to 135 with a 
mean of 108. Approximately 78% of the 
cases fell above an IQ of 100. 

The four weighted scores were summed and 
the Full Scale scores were estimated using 
the simplified regression equations in Table 4 
of Doppelt’s article. A correlation of .925 was 
found between the obtained Full Scale score 
and the sum of the four subtests; the stand- 
ard error of estimate was computed to be 7.9 
points. These results are in good agreement 
with Doppelt’s correlation of .96 and stand- 
ard error of estimate of 7 points. It is to be 
noted that the standard deviation of the ob- 
tained Full Scale scores of this disturbed 
group was 20.79 compared with Doppelt’s 
figure of 25. This lower standard deviation 
partially compensates for the lower correla- 
tion in the computation of the standard error 
of estimate. 
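
The compensation described above can be checked numerically. The sketch below assumes the usual textbook formula for the standard error of estimate, SD times the square root of (1 - r squared); the paper does not state which formula was used, and the function name is hypothetical.

```python
import math


def standard_error_of_estimate(sd_criterion: float, r: float) -> float:
    """Assumed textbook formula: SD of the criterion times sqrt(1 - r^2)."""
    return sd_criterion * math.sqrt(1.0 - r ** 2)


# Figures quoted in the text: the patient group and Doppelt's sample.
see_patients = standard_error_of_estimate(20.79, 0.925)
see_doppelt = standard_error_of_estimate(25.0, 0.96)

print(round(see_patients, 1), round(see_doppelt, 1))  # 7.9 7.0
```

Under this assumption both reported values (7.9 and 7 points) are reproduced, which shows how a smaller standard deviation offsets a lower correlation.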

Table 1 presents the distribution of differ- 
ences between the obtained and predicted 
Full Scale scores for the 107 patients in- 
cluded in this study. The distribution closely 
approximates Doppelt’s findings; 70.1% of 
the cases fall within Doppelt’s standard error 
(7 points) as compared with 71% of Dop- 
pelt’s samples. The higher mean deviation of 
+ 2.5 as compared with Doppelt’s values of 
— 0.1 and — 0.5 suggests that Doppelt’s re- 
gression equations tended to underpredict the 
IQs in this emotionally disturbed group. As 
can be seen from Table 1, approximately 
twice as many cases were underpredicted as 
were overpredicted. Accuracy of prediction 
might be improved for the disturbed group 
by utilizing regression equations computed 
specifically for such a population. 
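
The percentages discussed above follow directly from the frequencies in Table 1. A minimal sketch of the tally, assuming the class intervals and frequencies as they can be read from that table (variable names are illustrative):

```python
# (lower bound, upper bound, frequency) of obtained minus predicted
# Full Scale scores, as read from Table 1.
table_1 = [
    (15, 21, 4), (8, 14, 22), (1, 7, 46), (0, 0, 4),
    (-7, -1, 25), (-14, -8, 5), (-21, -15, 1),
]

n = sum(freq for _, _, freq in table_1)

# Cases whose whole interval lies within one standard error (7 points).
within_one_se = sum(f for lo, hi, f in table_1 if -7 <= lo and hi <= 7)

# Positive difference: obtained exceeds predicted (underprediction).
underpredicted = sum(f for lo, _, f in table_1 if lo > 0)
overpredicted = sum(f for _, hi, f in table_1 if hi < 0)

pct_within = round(100 * within_one_se / n, 1)
print(n, pct_within, underpredicted, overpredicted)  # 107 70.1 72 31
```

The counts of 72 underpredicted and 31 overpredicted cases correspond to the "approximately twice as many" statement in the text.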

Table 2 presents the differences in actual 
IQ points between the obtained and predicted 
Full Scale IQs. Seventy-two per cent of esti- 
mated IQs deviate 4 points or less from the 
obtained IQs. In 9.3% of the cases there 
was perfect prediction. 


Table 1 


Distribution of Differences Between Obtained Full Scale 
Scores and Predicted Full Scale Scores * 


Difference      Frequency   Percentage 

+15 to +21          4           3.7 
+ 8 to +14         22          20.6 
+ 1 to + 7         46          43.0 
  0                 4           3.7 
- 7 to - 1         25          23.4 
-14 to - 8          5           4.7 
-21 to -15          1           0.9 


* Obtained Full Scale score minus the predicted Full Scale score. 


Table 2 


Distribution of Differences Between Obtained Full Scale 
IQ and Predicted Full Scale IQ * 


Difference      Frequency   Percentage 

+13 to +16          1           0.9 
+ 9 to +12          2           1.7 
+ 5 to + 8         20          18.7 
+ 1 to + 4         42          39.3 
  0                10           9.3 
- 4 to - 1         25          23.4 
- 8 to - 5          5           4.7 
-12 to - 9          2           1.7 


* Obtained Full Scale IQ minus the predicted Full Scale IQ. 


Summary 


An attempt was made to check the ac- 
curacy of Full Scale score prediction using 
the WAIS short form as proposed by Dop- 
pelt with a psychiatrically disturbed popula- 
tion. A correlation of .925 was obtained be- 
tween the sum of the four subtests and the 
obtained Full Scale score. The standard error 
of estimate was computed to be 7.9 scaled 
score points. The results suggest that the 
Doppelt short form yields reasonably accu- 
rate prediction of IQs in a disturbed popu- 
lation. 


Received April 2, 1956. 


Performance of Children on the Davis-Eells Games 
and Other Measures of Ability 


Mary I. Love and Sylvia Beach 


Cincinnati Public Schools 


Social scientists who work with children 
from a wide variety of backgrounds have long 
felt a need for tests in which verbal material 
and specific cultural factors would be of mini- 
mal importance. Existing group tests of in- 
telligence, with their frequent emphasis on 
vocabulary and reading, have appeared to im- 
pose a particular hardship on children with 
reading disabilities or those from groups 
where verbal communication is limited by 
isolated or unusual geographical or cultural 
settings. 

Several “culture-fair” tests have been de- 
veloped to try to circumvent these difficulties. 
The recently published Davis-Eells Test of 
General Intelligence or Problem-Solving Abil- 
ity (1) is one devised by a sociologist and an 
educational psychologist. 


The test is presented in such a form as to require 
the pupil to understand and respond to a variety of 
verbal material, but is entirely free of any reading 
requirements, . . . The situations represented by the 
items deal with kinds of problems such that chil- 
dren from different kinds of backgrounds will have 
more nearly equal opportunity for familiarity with 
the necessary experiences. . . . The verbal material 
used in administering the test has been carefully 
screened ... to eliminate words (or grammatical 
constructions) which would be more familiar in 
some groups than others (1, p. iv). 


Method 


Children in the 4th grades of the Cincinnati 
Public Schools are regularly given one of the 
traditional group tests of intelligence. In the 
school year 1953-54 they received Series D of 
the Sixth Edition of the Kuhlmann-Anderson 
Test (2), scored by the 1942 revised norms, 
administered by examiners from the school 
system’s Appraisal Service. For purposes of 


comparison, 4th grade children from eight 
schools were given the Davis-Eells Games, 
Elementary Form A, administered by a school 
psychologist. The Kuhlmann-Anderson Tests 
were given in the fall, the Davis-Eells Games 
from five to eight months later. 

The schools selected for the present study 
included four with populations predominantly 
from a lower socioeconomic level, two from 
an upper socioeconomic level, and two more 
nearly representative of a middle group. Cin- 
cinnati’s topography lends itself to defining 
such levels almost geographically, with peo- 
ple of the lower economic levels living in the 
Basin, or River-Bottom area, representatives 
of the middle group living on the first hills, 
and members of the higher group living far- 
ther out on the hills. There are, of course, 
exceptions, but the predominant level in each 
school district could thus be categorized. A 
total of 469 children received both tests. At 
the time they were given the Kuhlmann-An- 
derson Test the children ranged in age from 
8-6 to 12-4, with a median age of 9-7. The 
children were given the Davis-Eells games 
when their age range was 8-11 to 12-10, 
with a median age of 10-1. Table 1 shows 
the distribution according to socioeconomic 
level, race, and median age when tested. 

In the fall of 1954, 341 of the same chil- 
dren were given the California Reading Test, 
Elementary-BB (5). Although the number of 
children was smaller, the representative per- 
centage of the groups was not greatly differ- 
ent with 33% of children from the upper 
level, 27% from the middle, and 40% from 
the lower. 

As a subsidiary study, 110 3rd grade chil- 
dren in one of the predominantly lower-class 
schools were given the Davis-Eells Games and 
the California Test of Mental Maturity (4) 
when their median chronological age was 9-0. 
All of these children were Negroes. 


Table 1 

Distribution of Children Given Kuhlmann-Anderson and Davis-Eells Tests 
According to Age, Race, and Socioeconomic Level 

Socio-      Number   Number            Percent   Median    Median 
economic      of       of                of      age for   age for 
level       white    Negro    Total     total      K-A       D-E 

Upper        137        0      137        29       9-5       9-10 
Middle        56       62      118        25       9-7      10-2 
Lower         97      117      214        46       9-[illegible]   10-3 
Total        290      179      469 


Results 


Kuhlmann-Anderson scores are translated 
into mental age units, and thence into intelli- 
gence quotients. The authors of the Davis- 
Eells Test recommend that the measure de- 
rived from their test be called an “Index of 
Problem Solving Ability,” or IPSA, but add, 
“Since the Index of Problem Solving Ability 
is computed in the same way that many test 
makers compute an intelligence quotient, 
those users of the Davis-Eells Test who pre- 
fer the term ‘IQ’ may quite appropriately ap- 
ply this term to what the authors prefer to 
call the Index of Problem Solving Ability” 
(1, p. 29). In this paper the term IQ is used 
to refer to the performance on either test un- 
der consideration. 

The mean Kuhlmann-Anderson IQ of the 
469 children studied was 100.7 with a stand- 
ard deviation of 16.05. The median was 100.3. 
On the Davis-Eells the mean IQ was 90.1, 
with a standard deviation of 15.96. The me- 
dian was 88.2. The correlation between the 
tests was .53, significant beyond the 1% level. 
Table 2 shows the distribution of means ac- 
cording to socioeconomic level. 

Table 2 indicates that the children in the 
present study consistently rated higher on the 
Kuhlmann-Anderson than on the Davis-Eells. 
This trend was noted at all levels, with the 
difference being greater at the upper level 
(both IQ and socioeconomic) than at the 
lower. Differences between all pairs of means 
were statistically significant with a t beyond 
the 1% level in every instance. That is, dif- 
ferences between the means of different socio- 
economic groups on the same test were sig- 
nificant, as were differences between the 
means on the two tests. 
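
As a consistency check on the group means, the over-all Davis-Eells mean can be recovered as a weighted average of the three socioeconomic groups. The sketch below assumes the group sizes from Table 1 and the Davis-Eells means as they can be read from Table 2; the dictionary layout and names are illustrative only.

```python
# group: (number of children, Davis-Eells mean IQ), read from Tables 1 and 2.
davis_eells = {
    "Upper": (137, 100.8),
    "Middle": (118, 88.3),
    "Lower": (214, 84.2),
}

total_n = sum(n for n, _ in davis_eells.values())

# Weighted mean: each group mean counts in proportion to its size.
pooled_mean = sum(n * mean for n, mean in davis_eells.values()) / total_n

print(total_n, round(pooled_mean, 1))  # 469 90.1
```

The result agrees with the mean of 90.1 reported for all 469 children.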

The correlation of the reading test scores 
with the IQs appears in Table 3. 

Children also rated consistently lower on 
the Davis-Eells than on the California Test 
of Mental Maturity as shown in Table 4. In 
this case also the differences between the 
means were significant. 


Table 2 

Distribution of Means on the Davis-Eells and 
Kuhlmann-Anderson Tests 

Socio-        Kuhlmann-Anderson      Davis-Eells 
economic 
level           Mean      SD         Mean      SD 

Upper          115.9     11.1       100.8     12.7 
Middle          99.9     13.2        88.3     15.2 
Lower           91.8     12.9        84.2     13.7 
All cases      100.7     16.1        90.1     16.0 


Table 3 

Correlations Between Scores on the California Reading 
Test (Elementary-BB) and IQs on the Davis-Eells and 
Kuhlmann-Anderson 

Socio- 
economic      Kuhlmann-     Davis- 
level         Anderson      Eells 

Upper           .66         [illegible] 
Middle          .69         [illegible] 
Lower           .69         .23 
All cases       .80         .57 


Table 4 

Means and Correlations of the Davis-Eells and the 
California Test of Mental Maturity 

Test                                   Mean      r 

Davis-Eells                            82.5 
California Test of Mental Maturity 
  Language Factors                     87.2     .41 
  Nonlanguage Factors                  91.9     .61 
  Total                                89.2     .60 


Discussion 


Determining the nature or direction of 
characteristic variations in performance on 
the Davis-Eells and the traditional tests was 
the principal purpose of this study. The ra- 
tionale of the Davis-Eells, as expressed by its 
authors, implies that existing tests of intelli- 
gence tend to penalize some groups while the 
Davis-Eells cuts across verbal and cultural 
factors. The uniform tendency for ratings on 
the Davis-Eells to be lower than the ratings 
on the other tests does not confirm this hy- 
pothesis. The hypothesis was also somewhat 
challenged by the fact that the differences 
between the mean performances of children of 
lower, middle, and upper socioeconomic levels 
were statistically significant on the Davis- 
Eells, even as on the Kuhlmann-Anderson. 

One possible explanation of the lower rat- 
ings on the Davis-Eells is that the stand- 
ardization group was more highly selected 
than the test’s authors intended. Another pos- 
sibility would seem to lie in the physical 
structure of the test. The pictures are care- 
fully drawn with many minute details which 
might tend to handicap children with uncor- 
rected visual difficulties, of whom there are 
doubtless more among lower socioeconomic 
levels than among higher. The two test ques- 
tions which an item analysis showed to be 
most frequently missed by 4th graders were 
two hinging on particularly small pictorial 
details (numbers 19 and 10). 

It would also seem that the necessity for 
constant attentiveness to spoken instructions 
may impose a hardship on some children. 
However, none of the highly verbal “money” 
problems was among the 7 most difficult, 
although there were 39 “no answers” on 
these 8 problems, as opposed to 30 “no an- 
swers” on the 54 remaining test items. 

Another possible explanation for the lower 
ratings on the Davis-Eells might lie in the na- 
ture of the test material. While one of the 
merits of the test is its use of familiar prob- 
lem situations, one wonders whether this 
might not also be a source of weakness, in 
that some of the depicted episodes may be so 
realistic as to carry a high emotional content 
for the child and create something of a shock 
situation which would handicap his function- 
ing. Further work in this area might be profit- 
able. Studies at the Wayne County Training 
School have suggested that “the emotional 
loading of some of these pictures and the 
ambiguity of others would stimulate the ex- 
pression of our subjects’ needs, thus interfer- 
ing with the required intellectual activity” 
(3, p. 497). 

The correlations of the Kuhlmann-Anderson 
and the Davis-Eells with the California Read- 
ing test were interesting. There was an un- 
usually high correlation between the Kuhl- 
mann-Anderson and the reading test for all 
groups combined, and a moderately high cor- 
relation for the separate groups. The Davis- 
Eells correlation with reading achievement 
was in every instance lower, and with the 
lowest socioeconomic group it was extremely 
low, suggesting that the test may measure 
mental ability independently of reading abil- 
ity for that group where stimulation toward 
reading achievement at home is not so gen- 
erally a part of the culture. For the middle 
and upper groups, which may be assumed to 
have more cultural pressure toward reading, 
success on the Davis-Eells is about as much 
related to reading achievement as success on 
the more verbal Kuhlmann-Anderson. 


Summary 


The Davis-Eells Games were administered 
to 579 third and fourth grade children, 469 
of whom were given the Kuhlmann-Anderson 
Tests, and 110 of whom were given the Cali- 
fornia Test of Mental Maturity. The mean 
scores on the Davis-Eells Games were signifi- 
cantly lower than the mean scores on either 
of the traditional-type tests of intelligence. 

The California Reading Test was also given 
to 341 children receiving both the Davis- 
Eells and the Kuhlmann-Anderson, and the 
results were correlated. There was a high 
positive correlation between reading achieve- 
ment and performance on the Kuhlmann-An- 
derson for all socioeconomic groups, and a 
relatively high correlation between reading 
scores and Davis-Eells performance for mid- 
dle and upper socioeconomic groups. With the 
lowest socioeconomic group success on the 
Davis-Eells was only slightly related to read- 
ing ability, suggesting that for children of 
this category the test is divorced from read- 
ing achievement, if not from other cultural 
determinants. 


Prediction from the Cattell Infant Intelligence Scale 


and Irving D. Goldberg 
New York State Department of Health 1 


A number of investigators have shown that 
various infant psychometric tests have poor 
prognostic value (1, 2, 3, 7). Cattell reported 
some encouraging data derived from her 
standardization of the Cattell Infant Intelli- 
gence Scale (CIIS) particularly with respect 
to infants eighteen months of age and older 
children. She cautioned against using scores 
from very young infants for predictive pur- 
poses. Gallagher (5) reports a high correla- 
tion between nine-month and sixteen-month 
CIIS scores on the same infants. The wide- 
spread use of the CIIS and the paucity of 
data on its predictive efficiency call for fur- 
ther research. 

The primary purpose of this report is to 
present some findings related to the predictive 
value of the CIIS when administered to six- 
month-old infants. These findings were de- 
rived from a longitudinal study being con- 
ducted at Children’s Hospital, Buffalo, New 
York. 


Description of the Child Growth Study 


The plan and method were described in de- 
tail in a previous report (8). Briefly, the 
study was planned to evaluate the effects of 
trauma to the central nervous system, anoxia, 
and other possible influences on gestation, de- 
livery, the neonatal period, and the pre-school 
child. 


1 The Child Growth Study from which these data 
were derived was developed through the cooperative 
efforts of the Department of Pediatrics and Obstet- 
rics of the University of Buffalo School of Medicine 
and the New York State Department of Health. 


The Study subjects were infants born at Children’s 
Hospital, Buffalo, New York within the period from 
September 1949 to December 1953. Three children 
were born in other hospitals within the city. Ini- 
tially, there was no preferential selection of cases 
for inclusion in the Study, except that for con- 
venience the majority of children included were born 
during the daytime hours. Since one purpose of the 
study was concerned with neurological factors, some 
preferential selection was given subsequently to chil- 
dren born by Caesarean section, induction, labors of 
long duration, and other potential or definite varia- 
tions from a normal delivery. 

All children observed received a physical examina- 
tion and when possible an electroencephalogram 
shortly after birth. The general course of behavior 
during the neonatal period was noted and follow-up 
visits were scheduled at 6, 12, 18, 24, 36, 48 and 60 
months of age. At each visit the child was sched- 
uled to receive a physical and neurological exami- 
nation, an electroencephalogram, and an intelligence 
test, except at 18 months of age when only the psy- 
chological examination was routinely scheduled. In 
addition to intelligence testing, the psychologist spent 
a portion of the visit obtaining data from the parents 
concerning family life, the child’s behavior and ad- 
justment. 

The CIIS was administered to children from 6 
through 24 months of age and the Revised Stanford 
Binet Form L (SB) thereafter. Over the period of 
the study, four psychologists have been employed at 
different times. At present all scheduled examinations 
through 18 months of age are completed, and some 
children admitted early in the study have been ob- 
served for five years. 

The great majority of children were tested within 
one month of the age designated. The six-month 
group included children seen between 5 and 7 months 
of age and a few at 8 months of age; the 12- through 
36-month groups did not deviate by more than two 
months from the planned date of visit (except for 
one child aged 27 months who was included in the 
24-month group); only for the 48- and 60-month 
examinations did the groups contain children who 
deviated more than two months. 

In scoring the CIIS the CA was calculated to the 
nearest tenth of a month from date of birth. Dis- 
tinction was made for this report between “valid” 
and “questionably valid” tests according to the ex- 
aminer’s judgment. A test was termed questionably 
valid when the examiner felt a child was not doing 
his best for reasons of irritableness, fatigue, lack of 
cooperation, etc. Except where otherwise noted, only 
tests rated as “valid” were used in the analysis of 
our data. The study children are somewhat above 
the mean intellectually. This is shown by Fig. 1, 
which presents the distribution of Binet IQs ob- 
tained from the first 50 children who had com- 
pleted the five-year examination. 


In studying the effects of gestational ma- 
turity on the CIIS, infants were classified as 
premature if their gestation was less than 42 
weeks and their birthweight was 5 pounds 8 
ounces or less. Mature infants were defined as 
those weighing more than 5 pounds 8 ounces and 
having a gestational age of less than 42 weeks. The re- 
mainder, those with a gestational age of 42 
weeks or over, comprised the postmature 
group. 
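
The three-way classification above amounts to a simple decision rule. A minimal sketch follows, assuming gestation is given in weeks and birthweight in ounces (5 pounds 8 ounces = 88 ounces); the function and enum names are hypothetical and are not taken from the study.

```python
from enum import Enum


class Maturity(Enum):
    PREMATURE = "premature"
    MATURE = "mature"
    POSTMATURE = "postmature"


# Thresholds stated in the text.
POSTMATURE_WEEKS = 42
LOW_BIRTHWEIGHT_OZ = 5 * 16 + 8  # 5 pounds 8 ounces


def classify_maturity(gestation_weeks: float, birthweight_oz: float) -> Maturity:
    """Apply the paper's definitions in order of precedence."""
    # Gestation of 42 weeks or over is postmature regardless of weight.
    if gestation_weeks >= POSTMATURE_WEEKS:
        return Maturity.POSTMATURE
    # Under 42 weeks and 5 lb 8 oz or less is premature.
    if birthweight_oz <= LOW_BIRTHWEIGHT_OZ:
        return Maturity.PREMATURE
    # Under 42 weeks and heavier than 5 lb 8 oz is mature.
    return Maturity.MATURE


print(classify_maturity(38, 80).value)   # premature
print(classify_maturity(40, 120).value)  # mature
print(classify_maturity(43, 80).value)   # postmature
```

Note that under these definitions prematurity depends on birthweight as well as gestation, so a light infant of 42 weeks or more still falls in the postmature group.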

Results 


The mean IQ obtained from all children 
who had a “valid” psychological examination 
at 6 months of age and/or 12, 18, 36 months 
of age is shown in Table 1 according to ma- 
turity classification. The mean IQ of prema- 
tures was significantly lower (p < .001) 
than that of the matures or postmatures at 6 
months of age. Further, the mean IQ of the 
postmatures was significantly higher (p = 
.04) than that of the matures. 

The effects of prematurity on IQ were still 
highly significant at 12 months of age though 
somewhat diminished. The mean IQ scores for 
matures and postmatures were identical at 12 
months of age. In the 18- and 36-month ex- 
aminations, however, between-group differ- 
ences were not significant. The increment in 
IQ for prematures after 6 months is not due 
to any bias of selection, since this finding is 
upheld when the same children are studied at 
successive examinations. While the premature 
tended to score lower than the full-term in- 
fant at 6 months, this is no longer true at 18 
months of age. 


[Figure: histogram; horizontal axis, IQ; vertical axis, Number of Children] 

Fig. 1. Distribution of Stanford-Binet five-year 
IQs for 50 children with completed examinations. 
Mean IQ = 117.2, SD = 12.7. 


Table 1 
IQs at 6, 12, 18, and 36 Months 

                                    Maturity 
                          Pre-                    Post- 
Age                      mature      Mature      mature 

6 months 
  Number                   40         252          41 
  Mean IQ (Cattell)        91.2       106.4       109.6 
  SD                       14.6         8.9         9.2 
12 months 
  Number                   29         190          34 
  Mean IQ (Cattell)        94.4       102.2       102.4 
  SD                        7.4         8.6         9.6 
18 months 
  Number                   24         138      [illegible] 
  Mean IQ (Cattell)       102.4       105.3       102.2 
  SD                        8.3         8.4         6.3 
36 months 
  Number              [illegible] [illegible] [illegible] 
  Mean IQ (Binet)     117.[illegible] [illegible] [illegible] 
  SD                       11.5        13.9         9.5 

To obtain some indication of the stability 
of the IQ at 6 months, those children for 
whom a “valid” test was obtained at both 
6 and 36 months were studied. The scores 
achieved at these two age levels were sub- 
divided into three broad groups: under 91, 
91 through 116, and 117 or over (approxi- 
mating Terman’s classification). This division 
provides grouping of below-average, average, 
and above-average scores. A contingency table 
was prepared using these classifications to de- 
termine the extent to which children with low, 
average, and high scores at 6 months of age 
remained in the same category at 3 years of 
age. There were 57 children (excluding all 
prematures) who had “valid” scores on both 
the 6- and 36-month tests. The distribution of 
scores is given in Table 2. 


Table 2 
Change in IQs at 6 and 36 Months of Age 

                    36-Month IQ (Stanford-Binet) 
6-Month IQ        Under                117 and 
(Cattell)           91      91-116      over      Total 

Under 91             -         2          -          2 
91-116               1        14         25         40 
117 and over         -         6          9         15 
Total                1        22         34         57 


The difference between the 6-month and 
36-month scores is clear. The 6-month score 
is generally lower than that which a given 
child attained at 36 months. Neither of the 
two children with an IQ under 91 at 6 months 
remained in that category at three years of 
age, and both of these children had achieved 
a score of more than 110 at the later age. 
Likewise, more than 60 per cent of those 
with an “average” IQ initially had scored 
“high” at 36 months. Further analysis of 
these data indicates that this shift in scores 
was not confined to IQs around the group 
divisions, but rather was widespread through- 
out the entire range of scores. 
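
The proportions quoted in this paragraph can be recomputed from the contingency table. The sketch below assumes the cell counts as they can be read from Table 2, with blank cells taken as zero; the variable names are illustrative.

```python
# Rows: 6-month Cattell IQ group; columns: 36-month Stanford-Binet IQ group.
# Group order in both directions: under 91, 91-116, 117 and over.
# Blank cells in the table are assumed to be zero.
table_2 = [
    [0, 2, 0],
    [1, 14, 25],
    [0, 6, 9],
]

n = sum(sum(row) for row in table_2)

# Children on the diagonal stayed in the same category at both ages.
same_category = sum(table_2[i][i] for i in range(3))
pct_same = round(100 * same_category / n, 1)

# Share of initially "average" children who scored "high" at 36 months.
average_row = table_2[1]
pct_average_to_high = round(100 * average_row[2] / sum(average_row), 1)

print(n, pct_same, pct_average_to_high)  # 57 40.4 62.5
```

On these counts 25 of the 40 initially average children (62.5 per cent) scored high at 36 months, in line with the "more than 60 per cent" stated above.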

When the distribution of scores, excluding 
prematures, was divided into thirds at 6 
months and at 36 months, it was found that 
only 40 per cent of the children remained in 
the same third of the distribution. For ex- 
ample, of the 19 cases comprising the upper 
third at 6 months only six were included in 
the upper third at 36 months, with eight of 
the remaining thirteen lying in the lowest 
third at 36 months. 

In addition, the Pearsonian r was calculated 
for all possible age combinations of the 
“valid” IQ scores between 6 and 48 months. 
Because of the extreme effect of prematurity 
on 6- and 12-month scores, children born pre- 
maturely were excluded from consideration in 
all correlations with the 6- and 12-month 
IQs. The correlation coefficients are presented 
in Table 3. From this table it is clear that the 
correlation of the 6-month IQ with that at later 
ages is low. In fact, only in correlation of the 6- 
with the 12-month IQ was the coefficient sta- 
tistically different from zero. The significant r 
in the 6- by 12-month correlation may be due 
to the similarity of the test items at these age 
levels. In contrast, the 
correlation coefficient at every other age save 
one (12—24 months) was significant at the 
.01 level. The absence of meaningful correla- 
tion of the 6-month IQ with those at the 
subsequent ages demonstrates the inadequacy 
of the 6-month IQ as a predictor of subse- 
quent test scores. 
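
Whether a given coefficient differs from zero depends on the sample size as well as on r. The sketch below uses the Fisher z approximation, which is an assumption made here for illustration; the report does not say which significance test the authors applied, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def fisher_z_p_value(r: float, n: int) -> float:
    """Two-sided p-value for H0: rho = 0, via the Fisher z approximation."""
    # atanh(r) is approximately normal with standard error 1 / sqrt(n - 3).
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# (r, N) pairs as read from Table 3.
for label, r, n in [("6 x 12", 0.38, 191), ("6 x 36", 0.23, 57),
                    ("36 x 48", 0.69, 68)]:
    p = fisher_z_p_value(r, n)
    print(label, "significant at .01" if p < 0.01 else "not significant at .01")
```

Under this approximation the 6- by 12-month and 36- by 48-month coefficients pass the .01 level while the 6- by 36-month coefficient does not, which matches the pattern described in the text.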

In an evaluation of the “trend” in these 
correlation coefficients, consideration must be 
given to possible bias in the selection of cases. 
All tests were completed for a limited number 
of children; hence, the difference in numbers 
available for study at various ages. Thus, the 
68 children used in the comparison of the 36- 
and 48-month IQ are those born at the begin- 
ning of the study, while the 191 children com- 
pared at 6 and 12 months are those drawn 
from the entire study population. 

Excluding the prematures, there were 16 
children who received all psychological tests 
from 6 through 60 months of age and who 
were considered to have given a “valid” score 
at each test. An additional group of 18 chil- 
dren had completed all tests, but for each 
child in this group at least one of the seven 
tests was considered to be of only “question- 
able validity.” Although the difference be- 
tween these two groups in the mean IQs at 
each age was not statistically significant, gen- 
erally the “questionably valid” scores ap- 
peared to be slightly lower than the scores 
considered to be “valid.” Table 4 presents 
the correlation coefficients (similar to those 
of Table 3 with the addition of the 60-month 
correlations) based on the combined group of 
34 children. Although some slight error may 
be present due to the inclusion of the “ques- 
tionably valid” scores, it is probable that this 
would be offset by increase in size of the 
combined group. 


Table 3 
IQ Correlation Coefficients 

Ages compared 
(months)           N          r 

 6 X 12           191        .38** 
 6 X 18           136        [illegible] 
 6 X 24           103        .06 
 6 X 36            57        .23 
 6 X 48            61        .20 

12 X 18           138        .40** 
12 X 24           104        [illegible] 
12 X 36            58        [illegible] 
12 X 48            60        .40** 
18 X 24           100        [illegible] 
18 X 36            62        .50** 
18 X 48            64        [illegible] 
24 X 36            61        .60** 
24 X 48            59        .40** 
36 X 48            68        .69** 

** Significant at the .01 level. 


Table 4 

Correlation Coefficients of IQs for 34 Children Who 
Completed All Tests from 6 Through 60 Months 

Ages compared 
(months)           r 

 6 X 12           [illegible] 
 6 X 18           .32 
 6 X 24           .08 
 6 X 36           .05 
 6 X 48           .26 
 6 X 60           .21 

12 X 18           .36* 
12 X 24          -.01 
12 X 36           .35* 
12 X 48           .50** 
12 X 60           [illegible] 

18 X 24           .48** 
18 X 36           .21 
18 X 48           .23 
18 X 60           .19 

24 X 36           .46** 
24 X 48           .38* 
24 X 60           .24 

36 X 48           [illegible] 
36 X 60           [illegible] 

48 X 60           .69** 

* Significant at .05 level. 
** Significant at .01 level. 

The patterns observed in Table 3 are mani- 
fest in Table 4. Starting with the 18-month 
IQs, the greater the intertest interval the 
lower the correlation. As was also evident in 
Table 3, an appreciable correlation is first 
noted when the 36-month IQ is compared 
with the score at a subsequent age. This may 
be related to the fact that the SB was used 
beginning with age 3. The lack of significance 
of coefficients in Table 4, which are signifi- 
cant in Table 3 is probably due to the smaller 
numbers involved. However, Table 4 con- 
firms the previously noted absence of signifi- 
cant correlation of 6-month IQs with those at 
later ages. 

An analysis of variance of the IQ for the 34 
children who had completed all tests through 
5 years of age, disclosed no difference among 
the mean scores (SB) at 36, 48, and 60 
months of age. Likewise, no differences were 
found among the mean scores obtained from 
the CIIS at 12, 18, and 24 months of age. 
However, the over-all mean of the SB scores 
(36-60 months) was higher and statistically 
different at the .01 level from the mean of the 
CIIS scores (12-24 months). It would appear 
that those functions tapped by the infant 
scale are different from and not directly pre- 
dictive of the functions measured by the SB. 
Although it has been stated that the CIIS 
(4) is a downward extension of the SB, the 
analysis of our data does not bear this out. 
Further, the mean score at 6 months was sig- 
nificantly different (p < .01) from the mean 
scores at 12, 18, and 24 months. This supports 
the conclusion already noted regarding the in- 
adequacy of the 6-month IQ as a predictive 
index of later test scores. 

The trend of the mean IQ at each age for 
the 34 children with all tests completed is 
depicted in Fig. 2. The similarity of mean 
scores between 12 and 24 months, the simi- 
larity of mean scores between 36 and 60 
months, the difference between these two 
groups, and the uniqueness of the 6-month 
IQ are all readily apparent. The fact that the 
upper 99 per cent confidence limit at 6 
months overlaps lower limits at 36, 48, and 
60 months should not be construed to mean 
that the differences between the 6-month 
score and those at later ages are not signifi- 
cant statistically. The appropriate test, which 
takes into account the fact that the same 
children are involved throughout, does dem- 
onstrate significance. 


Fig. 2. Mean IQs for 34 children who completed all tests. 


Discussion 


The findings reported above confirm the 
conclusions of some previous investigators 
that the intelligence scores of children below 
two years of age are of little value in predic- 
tion of subsequent IQ scores. The correlations 
obtained in the present study are consistently 
lower than those reported by Cattell at the 
same age levels. Despite the relatively high 
correlation obtained at the most proximal age 
comparison (24 by 36 months), Fig. 2 dem- 
onstrates the wide discrepancy between the 
mean IQs at these two ages. Thus, it is clear 
that a relatively high correlation coefficient 
between these two tests does not permit a 
direct prediction of SB scores from earlier 
Cattell scores. 

The hypothesis has been advanced by sev- 
eral investigators that the present validity of 
infant testing lies in its differentiation of the 
extremes in a population (4, 7). Because of 
the absence of extremely low scores in our 
sample, we were unable to test this hypothe- 
sis. It is possible that the Cattell provides 
some measure of maturational level, but it 
does not appear to be directly predictive of 
those intellectual functions measured at later 
ages by the SB. 

Several aspects of this problem bear fur- 
ther investigation. It is possible that an item 
analysis may produce a selection of items 
which will have better predictive value than 
those which are now in use. Also, an analysis 
of individual growth patterns might help to 
determine whether or not intellectual matura- 
tion is a curvilinear function. The data pre- 
sented in this report will again be analyzed 
when all examinations have been completed. 


Summary 


A relationship was found to exist between 
IQ and fetal maturity among children with a 
CA of approximately 12 months. However, 
the effect of prematurity on test scores com- 
pletely disappeared by 18 months of age. 

The CIIS was employed at the age of 6, 
12, 18, and 24 months, and the Revised Stanford-Binet
(Form L) at 3, 4, and 5 years.
An analysis of variance disclosed no differ- 
ences among the mean scores obtained from 
the Cattell at 12, 18, and 24 months, nor 
among the SB mean scores at 36, 48, and 60 
months. However, the differences in the scores 
at 6 months when compared with the scores 
at 12, 18, and 24 months were significant.
Further, the over-all mean of the SB scores 
was higher and statistically different from the 
over-all Cattell mean score. This discrepancy 
may be due to the differences in the structure 
of the two tests. 

In addition to a study of individual scores, 
the data were examined by means of correla- 
tion coefficients and an analysis of variance. 
The results of these analyses showed that the 
CIIS at the age of 6 months was a poor pre- 
dictive index of intelligence. This supports the 
findings of other investigators who used dif- 
ferent psychometric instruments and empha- 
sizes the limitations involved in using the re- 
sults of the 6-month test on an individual 
basis. 

Recent studies on the relationship between
various measures of intelligence and the Taylor
Manifest Anxiety Scale (A Scale) have
yielded contradictory results. Calvin et al.
(3), Kerrick (6), Matarazzo et al. (8), and
Grice (5) reported a slight but significant
negative correlation. Schulz and Calvin (11),
Farber and Spence (4), and Mayzner et al.
(10) found no relationship whatever. Taylor
(12) presented the homogeneity of intelli- 
gence among college students as one possible 
explanation; Klugh and Bendig (7) sug- 
gested that sampling fluctuation is respon- 
sible; Matarazzo et al. (9) were concerned 
with the criterion of intelligence and empha- 
sized that the strength of the relationship is 
only moderate. 

At least three separate but related and un- 
controlled variables emerge from this history: 
(a) the criterion: various measures of intelli- 
gence have been employed; (b) the sample: 
relatively homogeneous Ss have been used 
(college students and selected service per- 
sonnel); (c) psychopathology: presence or 
absence and degree. The present study is an 
attempt to control heterogeneity of intelli- 
gence and presence of psychopathology. 


Method 


Subjects. The Ss were 100 psychiatric aides 
and 100 outpatients who had taken the Wech- 
sler-Bellevue, Form I, and the MMPI dur- 
ing the same test administration. Sex dis- 
tribution was similar with 56 women aides 
and 57 women outpatients. The Ss were 
drawn in alphabetical order from hospital 
files. Background information (sex, age, edu- 
cation) for both groups was available. 


1 This study was conducted at the St. Louis State 
Hospital, St. Louis, Missouri, 

Procedure. The MMPI answer sheets
(group administration) were hand-scored
independently by two clerks for the A scale
and the Winne scale. The Winne Scale of
Neuroticism, which is derived from the
MMPI Neurotic Triad (Hy, D, Hs), was
used to provide an independent measure of
psychopathology. The A-scale items included
in the MMPI were used because of evidence
that equivalent results are obtained with
both A-scale forms, MMPI and Biographical
Inventory (1, 2).

The groups were compared statistically
on background variables, intelligence, and
MMPI scores. Correlations were obtained
between A-scale scores, intelligence, and
background variables.


Results 


Equivalence of groups. There were no
significant differences between the two groups
on Wechsler scores and number of years of
education (Table 1). However, the difference
between mean ages was significant (p < .05).
The two groups are thus similar in
intelligence and education. They appear to
approximate the general population with
respect to these variables.

Presence of psychopathology. The mean
Winne scale score for the outpatient group
exceeds the cutoff score of 11 which correctly
identifies approximately two-thirds of
neurotic Ss (14). The differences between aide
and outpatient A-scale and Winne scores were
significant at the < .001 level. The two
groups thus represent different degrees of
psychopathology and could be labeled
“normal” and “neurotic.”

Table 1
Comparison of Variables for Aides and Outpatients

                        Aides         Outpatients
Variable              Mean    SD     Mean    SD       t       p
Wechsler-Bellevue
  Verbal              97.4   15.7    96.9   11.2     0.3      -
  Performance        101.8   14.4   100.0   13.5     0.9      -
  Full Scale         100.2   14.6    98.4   12.1     1.0      -
Age, years            34.9   12.5    31.0    9.7     2.4    <.05
Education, grade       9.9    1.8    10.2    2.0     0.3      -
Winne Scale            3.6    1.9    13.1    5.9    16.0    <.001
A Scale                9.1    6.0    23.2    8.6    14.0    <.001


A-scale scores and background variables.
The relationships between background variables
and A-scale scores were not significant
for either group (Table 2). The Winne and
A-scale scores were significantly related, as
might be expected from two measures of
“anxiety,” and the degree of relatedness was
comparable to past research (13). Six MMPI
items occur in both scales.

A-scale scores and intelligence. Zero-order 
correlations were obtained between Wechsler 
and A-scale scores for the aide group. The 
correlations for the outpatient group were 
negative but not significant (Table 2). 


Discussion 


This study is essentially an example of con- 
trol applied to variables which may con- 
tribute to any relationships obtained between 
A-scale scores and intelligence. Heterogeneity
of intelligence and degree of psychopathology 
were controlled by means of sampling pro- 
cedures and size and source of sample. 

The results suggest that intelligence and
Manifest Anxiety are not significantly related
when measured by the Wechsler-Bellevue
Form I and the MMPI A-scale items,
respectively. The presence of psychopathology
may contribute to the trend toward a slight
but significant negative relationship between
anxiety and intelligence. It should also be
noted that differences in age and education
may contribute to the obtained correlations.


Table 2
Pearson r's Between A Scale and Other Variables
for Aides and Outpatients

                              Group
Variable               Aides          Outpatients
Wechsler-Bellevue
  Verbal               -.01           [illegible]
  Performance          -.02           [illegible]
  Full Scale            .00           [illegible]
Age, years             [illegible]    [illegible]
Education, grade       [illegible]    [illegible]
Winne Scale            [illegible]    [illegible]

Note.—An r of .195 is significant at the .05 level of confidence.
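The note to Table 2 gives .195 as the smallest r that reaches the .05 level. A minimal sketch of where such a threshold comes from, assuming the usual conversion between r and Student's t. The critical t of 1.984 (two-tailed, .05, about 100 degrees of freedom) is a standard tabled value supplied here as an assumption; the study does not report it.

```python
import math

def critical_r(t_crit: float, df: int) -> float:
    """Smallest |r| reaching significance, from r = t / sqrt(t^2 + df)."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

# Assumed two-tailed .05 critical t for roughly 100 degrees of freedom.
T_CRIT_05 = 1.984

r_df100 = critical_r(T_CRIT_05, 100)  # tabled entry for df = 100
r_df98 = critical_r(T_CRIT_05, 98)    # df = N - 2 for a group of 100 Ss
print(round(r_df100, 3), round(r_df98, 3))
```

With fewer degrees of freedom the threshold rises slightly, so the tabled df = 100 entry is marginally lenient for a group of 100 Ss.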


Summary 


The relationship between Manifest Anxiety 
and intelligence was evaluated by means of 
a design which attempted to control such 
variables as heterogeneity of intelligence and 
presence of psychopathology. The Ss, 100 
“normal” and 100 “neurotic,” were similar in 
age and education. They were approximately 
normally distributed with respect to intelli- 
gence test scores. The results suggest that 
considerable caution must be exercised in
interpreting any relationship between intelligence
and Manifest Anxiety. Although no significant
relationship was demonstrated, the
present statistical results illustrate that faulty
control of relevant variables may have con- 
tributed to some of the apparent significance 
of past research. 

Recent research by Singer, Meltzoff, and
others, centering around the Rorschach Hu- 
man Movement response (M), has provided 
experimental support for a theoretical struc- 
ture (14, 26) which considers impulse delay, 
empathic motion perception, fantasy, and 
thinking to be related, and, in some respects, 
interdependent processes. The research has 
suggested that M may be considered both as 
a product and a measure of the delay function
of the ego, or of inhibition ability (8, 9,
11, 17, 18, 19, 20, 21, 22, 23). 

In several studies, M has been shown to
correlate significantly with general intelli- 
gence or aspects of it (1, 2, 7, 24, 27). From 
the point of view of psychoanalytic theory, 
Rapaport, Gill, and Schafer (15) and Fromm, 
Hartman, and Marschak (6) have suggested 
that intelligence test performance may be 
conceptualized in terms of ego functions. The 
correlation of M with intelligence suggests 
that the delay function of the ego, or inhibi- 
tion ability, may be directly involved in in- 
telligence test performance. The present study 
is designed to examine the relationship of a 
specific bit of intelligence test performance 
to M and to another measure of inhibition 
ability. 

The mirror-image N, the symbol for the
number 2 in the Wechsler-Bellevue Form I
(25) digit symbol subtest, is reproduced as
an N by approximately ten per cent of our
clinic population. Analysis of this error
suggests several possibilities. Individuals who
make this error may not make a necessary
adjustment in an habituated motor response.
The stimulus may be perceived correctly, but
the error reflects poor inhibition of the motor
act of writing the familiar N. On a perceptual
level, S may permit closure to take place too
rapidly so that the normal N is actually
perceived. On a cognitive level, S may respond
as if there is no difference between the
stimulus as given and the normal N. At each level
the error may be considered a function of 
an insufficient delay or control of a response 
tendency. If this analysis is correct, then Ss 
who make the error (reversers) should pro- 
duce fewer M responses than controls who do 
not make the error. Reversers should also be 
less able than controls with respect to the 
ability to inhibit an old association and rap- 
idly substitute a new one for it. 


Subjects 


The 274 Ss are veterans with a wide va- 
riety of psychiatric diagnoses, who had been 
referred for psychological testing in an out- 
patient setting. 


Procedures 


Ninety-eight reversers were selected from 
our research files on the basis of the appearance
of one or more reversals of the mirror-image
N and the fact that the Rorschach test
had also been administered. 

One hundred controls were selected by 
choosing the next case, in alphabetical order
to the reverser, who did not make the error 
and who had been administered the Rorschach 
test. The Rorschach scoring of the original 
examiner was accepted in all cases. Undoubt- 
edly, there is unreliability inherent in this 
procedure, but it is offset to some extent by 
the fact that most of the examiners in our 
clinic were trained at the same university and 
in largely the same clinical centers. 

To the group above were added 27 more
reversers and 49 more controls who had been
administered a test of cognitive inhibition.
The procedure, previously described by the
authors (8, 10), briefly is as follows: A list of
10 easy paired associates is read to S. After
the associates are learned to a criterion of one
perfect recitation, S is asked to respond, upon
presentation of the stimulus word, with any
word other than the learned associate.
Cognitive inhibition time (CIT) is taken as the
average time interval between presentation of
the stimulus and the response for the 10 pairs.
Since this time presumably is taken up in
part by the process of finding a new
association, a measure of word association time
(WAT) is also obtained. WAT is computed 
as the mean response time to a list of 10 
other words taken from the same source as 
the original list (15). 
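
The two timing scores described above can be sketched as simple means over ten trials each. This is a minimal illustration only: the function name and all latencies below are hypothetical and are not data from the study.

```python
from statistics import mean

def mean_latency(latencies_sec):
    """Average stimulus-to-response interval, in seconds, over one word list."""
    return mean(latencies_sec)

# Hypothetical latencies for a single S (seconds); not data from the study.
inhibition_trials = [4.2, 6.1, 5.0, 7.3, 3.8, 5.5, 6.4, 4.9, 5.2, 6.6]
association_trials = [2.1, 3.0, 2.6, 2.9, 2.2, 3.4, 2.7, 2.5, 3.1, 2.8]

cit = mean_latency(inhibition_trials)   # cognitive inhibition time (CIT)
wat = mean_latency(association_trials)  # word association time (WAT)
print(f"CIT = {cit:.2f} s, WAT = {wat:.2f} s")
```

Keeping WAT as a separate score, rather than folding it into CIT, makes it possible to ask whether a group difference in CIT merely reflects slower free association.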

All tests, Rorschach, Wechsler-Bellevue, and 
Cognitive Inhibition were administered indi- 
vidually in the same clinical setting. 


Results 


Table 1
Percentage of Reversers and Controls Producing Less
than Two M Responses, Adjusted for
Response Total (R)

                    Reversers           Controls
R                 N   Percentage     N   Percentage      t       p
Not adjusted     125     66.4       149     49.7        2.78    .005
<20               73     76.3        79     55.7        2.54    .02
>20               52     53.8        70     27.1        3.00    .003
>30               20     55.0        35     11.4        3.49    .0005


Table 2
Mean Cognitive Inhibition (CIT) and Word Association
(WAT) Times, in Seconds, for
Reversers and Controls

             Reversers        Controls
Test         N    Mean       N    Mean        t       p
CIT          27   5.83       49   4.46       2.24    .03
WAT          26   2.79       49   2.48        .77     -


The hypothesis that reversers will produce
fewer M responses than controls is supported
by the data (Table 1). A significantly greater
proportion of reversers than controls produce
less than 2 M. Two M was selected as a
breaking point since it is approximately the
median M for our total clinic population. It
has been suggested that differences in Rorschach
scores may be attributable to differences
in total number of responses (R) (4).
There is a difference in R between the two
groups in favor of controls significant at p
= .07. The two groups were therefore compared
for M with some control for R. Under
these conditions, reversers still produce less
than 2 M significantly more frequently than
controls. The difference actually becomes more
pronounced as R increases. In the group
producing 30 or more responses, 55 per cent of
reversers produced fewer than 2 M while only
11 per cent of controls have less than 2 M.
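
The comparison in the unadjusted row of Table 1 can be approximated from the published percentages. The sketch below assumes a pooled-variance test for two independent proportions; the paper does not say which formula was used, and the counts are back-calculated from the percentages rather than reported.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference between two independent proportions (pooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Counts back-calculated (an assumption) from the unadjusted row of Table 1:
# 66.4% of 125 reversers and 49.7% of 149 controls gave fewer than two M.
reversers_low_m = round(0.664 * 125)
controls_low_m = round(0.497 * 149)

z = two_proportion_z(reversers_low_m, 125, controls_low_m, 149)
print(round(z, 2))
```

The result lies close to the tabled value of 2.78, which suggests, without establishing, that a test of this general form was applied.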

The hypothesis that reversers will be poorer 
on a test of cognitive inhibition is also sup- 
ported (Table 2). Reversers had a mean CIT 
of 5.8 seconds while controls had a mean time 
of 4.5 seconds. This difference is significant at 
p = .03. As found in previous studies (8, 10),
the difference in inhibition scores is appar- 
ently not simply dependent upon associative 
facility. There is no significant difference be- 
tween reversers and controls in word associa- 
tion time. 

Since color responses are often held to be 
related to impulsiveness (3, 5, 12, 15) they 
were examined in this context. Reversers, sig- 
nificantly more often than controls, have a 
Sum C= 0. Only 18.8 per cent of the con- 
trols did not produce any color responses 
while 30.4 per cent of the reversers showed 
an absence of color. This difference was sig- 
nificant at p= .02. When this difference in 
color productivity was tested controlling for 
R, the difference in proportions producing no
color responses was significant only for the
group producing more than 20 R. Under these 
circumstances it cannot be concluded that an 
absence of color responses is related to the 
reversal error independently of the dependence
of Sum C on R. There are also no significant
differences between reversers and controls
in the production of FC, CF, or C.
However, in all categories of color scoring, 
controls tend to be slightly more productive. 

The results were also considered in terms of 
the Erlebnistype ratio, M to Sum C. Four 
groups were formed on the basis of the me- 
dians of the distributions of M and Sum C. 
A significantly greater proportion of reversers
than controls fell into the Low M - Low Sum
C class. A significantly greater proportion of
controls than reversers fell into the High M -
High Sum C class. The examination of experience
balance demonstrates no new finding. It
is not unlikely that the differences in Erleb- 
nistype are primarily due to the difference in 
M between reversers and controls. 
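
The four-group classification by medians can be sketched as follows. The records are hypothetical, and the rule that a score exactly at the median counts as Low is an assumption; the paper does not say how ties were handled.

```python
from statistics import median

def erlebnistype_class(m, sum_c, m_median, c_median):
    """Label a record High or Low on M and on Sum C relative to the group medians."""
    m_label = "High M" if m > m_median else "Low M"
    c_label = "High Sum C" if sum_c > c_median else "Low Sum C"
    return f"{m_label} / {c_label}"

# Hypothetical (M, Sum C) pairs; not data from the study.
records = [(0, 0.0), (1, 0.5), (3, 2.0), (4, 3.5), (2, 1.0), (5, 0.5)]
m_median = median(m for m, _ in records)
c_median = median(c for _, c in records)

classes = [erlebnistype_class(m, c, m_median, c_median) for m, c in records]
for record, label in zip(records, classes):
    print(record, label)
```

Because each record's class depends on M directly, a group difference in M alone is enough to shift the class frequencies, which is the point made in the paragraph above.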

Controls, as a group, have a significantly
higher IQ than do reversers. Reversers have
a mean IQ of 100.76 (s = 13.25), controls a
mean IQ of 109.46 (s = 15.14). This difference
yields a t = 5.06 which is significant at
p = .0001. The distribution of IQs for
reversers is a normal one and closely
approximates Wechsler’s norms. The distribution of
IQs for controls is skewed⁴ and contains a
preponderance of high IQ subjects. About 33
per cent of controls have an IQ of 120 or
more while only 5 per cent of reversers have
IQs of 120 or greater.
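
The reported t for the IQ difference can be checked against the published means and standard deviations. This sketch assumes separate variance estimates for the two groups and assumes that the full samples of 125 reversers and 149 controls entered the comparison; neither point is stated in the text.

```python
import math

def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for two independent means using separate variance estimates."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se

# Means and SDs as reported; group sizes are assumed to be the full samples.
t_iq = t_from_summary(109.46, 15.14, 149, 100.76, 13.25, 125)
print(round(t_iq, 2))
```

The value obtained agrees with the published t = 5.06 to within rounding of the summary figures.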


Discussion 


The data confirm the analysis of the
reversal error as a manifestation of poor ability
to inhibit or delay responses. Beyond this,
there is considerable suggestion that adequate
inhibition ability is an important factor in
earning a high score on the intelligence test.
Reversers had a significantly lower mean IQ
than did controls and this difference may not
be wholly attributed to artifact. Any single
S could not have lost more than one weighted
score point even if he had made the maximum
number of errors possible. The one weighted
point could hardly make a difference of more
than one or two IQ points in any given
instance. The obtained mean difference was
almost 9 IQ points. Since this is the case, it is
reasonable to assume that other instances of
lost IQ points were also due to failure of the
delay mechanism. Qualitative analysis of
performance on many of the subtests very likely
would reveal other manifestations of poor
inhibition ability.

4 Because of the skewed distribution, a nonparametric
test of the two groups about the median of the combined
groups was attempted. For 1 df, chi square = 39.56,
p < .0001.

There is a growing body of evidence to 
suggest inhibition ability involves a stable 
process in the person extending beyond the 
immediate stimulus situation. In terms of 
Wechsler-Bellevue performance, this would indicate
there are processes in the person extending
across particular subtests. The difficulty
in validating “patterns” with the
Wechsler-Bellevue (13) may very well have 
resulted from the attempt to impose arbi- 
trary meanings on the subtests in place of 
examining manifestations of definable ego 
processes involved in test performance. 

Rorschach (16) noted a relationship be- 
tween M and general intelligence and sev- 
eral studies since have reported significant 
correlations of M with various aspects of in- 
telligence (1, 2, 7, 24, 27). However, the 
rationale underlying such correlations is no- 
where clearly stated. The present findings 
support the rationale that both M and impor- 
tant aspects of intelligence test performance 
involve the delay function of the ego. Schul- 
man’s findings (17) relating M to abstraction 
ability and his suggestion that abstract think- 
ing reflects a delaying mechanism are in sup- 
port of such a view. Our present data reveal 
another specific aspect of intelligence test 
performance which seems to involve inhibition
ability, while M has already been theoretically
and experimentally related to such a
function. This hypothesis would direct us to 
seek further relationships between operation- 
ally defined and experimentally meaningful 
measures of ego functions and specific aspects 
of intelligence test performance. While one 
may look at this evidence as an approach to 
the so-called “nonintellective” factors of
intelligence, eventually it may be possible to
relate concepts of intelligence and intelligent
behavior in a general theory of personality.

The present findings also lend confirmation
to previous studies (5, 8, 20) which have
failed to support the relationship of color
responses to impulsiveness. Our present data do
not reliably support the conclusion that color
responses are related to the accurate execution
of the reversed N. The trends in the data
suggested another hypothesis about color
responses which eliminates the necessity for
tortuous reasoning relating color to emotionality.
Both the color on the Rorschach cards
and the mirror-image quality of the Wechsler
symbol may be conceptualized as environmental
features demanding attention. In the
case of the Rorschach, the color either stands
out as figure on the cards, or, in appearing
after a long series of uncolored cards, it
introduces a difference difficult to ignore.
Similarly the instructions of the Wechsler
subtest explicitly require accuracy, and
the mirror-image quality cannot be ignored
without S failing this aspect of the task. From
this point of view, the suggestive relationship 
between absence of color responses and writ- 


Two 
riety of 
ated on the basis of 


or control of 
pothesis was 
that reversers 


for it. The mean IQ of the reversers was sig- 
nificantly lower than the mean IQ of controls. 

The findings provide further evidence of 
the general significance of the inhibition proc- 
ess, as measured by M and by specific tasks, 
in that manifestations of the inhibition proc- 
ess can be identified in intelligence test per- 
formance. These data suggest it may be fruit- 
ful to attempt to subsume concepts of intelli- 
gence and intelligent behavior under more 
general personality theory. 

Rorschach Response Characteristics as a Function
of Color and Degree of Emotional Constriction¹


Arthur Canter 
Johns Hopkins University School of Medicine 


According to Rorschach test rationale the 
reaction to color is dependent upon the emo- 
tional control of the responder. The achro- 
matic-chromatic studies generally have not 
controlled the effects of this relationship. The 
purpose of this study was to attempt such a 
control to examine the role of color. 

A response-defined scale of emotional con- 
striction (EC) was constructed. From a large 
pool of college student subjects, 38 pairs were 
drawn, matched on the basis of their EC 
scores, age, sex, and verbal ability. One member
of each pair was assigned at random to
either Group A or B, forming two groups of 
equal size having almost identical distribu- 
group was further 
subdivided into three levels of EC scores: 
high (Level I), moderate (Level II), and 
subjects were given 


The following hypotheses were made: (a) 
n time), and F+% 
ve “color” cards would 
A and B with the great- 
€ Level III pairs of 
subjects. (b) The color scores for Group A
subjects would be related to their EC scores
in accordance with the color-affect hypothesis.
(c) If the R for the two Rorschach groups
were found to be equivalent, the achromatic
subjects would have more shading scores than
the standard subjects. (d) This difference in
(c) would disappear if the shading and color
scores for Group A were combined in terms
of form-dominance and compared with the
shading scores of Group B. (e) The variances
of the Rorschach scores would be higher in
EC Level III than Level I.

1An extended r
tained without ch
Psychiatric Clinic,
more 5, Md, or fo
mentation Institute, Order Document No. 5088 from
ADI Auxiliary Publications Project, Photoduplication
Service, Washington 25,
D. C.
payable to Chief,
f Congress,

Hypothesis (a) was tested by the Median 
Rank test for each EC level. No differences 
were found between Groups A and B on R 
and RT for each EC level. The analyses of
F+% were also either not significant or
contradictory, leading to a rejection of hypothesis
(a). No significant relationships between EC
status and color scores were obtained. The
Median test analysis of (c) and (d) yielded
positive results. Group B gave more shading
scores (p < .02) but this difference disap- 
peared when the shading and color scores for 
Group A were combined as predicted. This 
suggests that color and shading are used simi- 
larly. Hypothesis (e) was also supported with 
higher variances for the Rorschach scores on 
Level III on both types of Rorschach. Thus 
color is apparently not as important to the 
variability of Rorschach performance as emo- 
tional constriction per se. 

As found in other studies, the role of color 
and the color-affect hypothesis seem to have 
been overvalued for the Rorschach test. 

Brief Report. 
Received November 9, 1956. 

Rorschach Scores as a Function of Four Factors¹


Conrad Consalvi
Ohio State Reformatory

The purpose of this study was to evaluate
the factorial composition of the Rorschach 
test in terms of intelligence and the formal 
processes involved in the scores. Several stud- 
ies have indicated that intelligence, verbal 
fluency, and productivity play significant 
roles in the occurrence of the scoring categories
(1, 2, 6, 12). It has also been suggested
that Rorschach determinants may be
organized according to the dominance or lack
of dominance of form without regard to the
differential use of the stimulus qualities of
the inkblots (9, 11, 14). The data of the
factorial studies have had limited generality
because the samples used often represented
restricted populations with respect to intelligence
or behavioral characteristics. Since little
effort has been made to distinguish between
general intelligence and verbal ability in such
studies, it is not possible to evaluate the effects
of the latter independent of general intelligence.

The present study was designed to investigate
the following major questions:
1. Does verbal ability operate differentially
from relatively culture-free general intelligence
on the Rorschach test?

2. What are the relationships between the
intelligence factors and the scoring categories
of the Rorschach?


1 This report is based on the data from a
thesis submitted by Conrad Consalvi to Vanderbilt
University under the direction of Dr. Arthur Canter,
then of the Vanderbilt faculty, and Mr. R. Norris
of George Peabody College for Teachers. The
authors wish to express their appreciation to Mr.
Norris for his invaluable aid in the statistical design and
analysis of the data.


3. What is the effect of combining Ror- 
schach scores in terms of form-dominance? 
Logical considerations as well as the findings 
of other studies suggest that bright and 
achromatic color might be combined and that 
shading scores not used as colors might be 
combined in terms of form-dominance. 

4. Finally, do the movement scores repre- 
sent a different factorial composition that 
would support the traditional views of sepa- 
rating them from each other as well as from 
the “external” determinants? 


Procedure 


The subjects were 45 adults (22 males and 
23 females) between the ages of 20 and 36, 
with a mean age of 27 years. The educational 
level ranged from the 6th grade to profes- 
sional college, with the median grade com- 
pleted as the 12th. Approximately 24 per cent 
had one or more years of college. Occupation- 
ally, the group was diverse, including house- 
wives, clerical workers, firemen, nurses, hos- 
pital aides, and mechanics in various trades. 
None of the subjects was a student at the 
time of testing. Included in the sample were 
six inmates from a local institution for the 
feebleminded. This was considered necessary 
to sample the lower end of the intelligence 
range. In no case was there evidence of be- 
havioral or central nervous system disorder. 
Except for the intellectual limitations of the 
institutional cases, the sample was consid- 
ered normal in the ordinary sense of the term 
rather than the psychiatric one in which nor- 
mality may represent an ideal of adjustment. 

Three tests were administered to each 
subject: the Raven’s Progressive Matrices 
(1938), the Vocabulary test of the Wechsler-Bellevue
Scale Form I, and the Rorschach
test. The Raven’s test was chosen as a gen- 
eral nonverbal test having wide applicability 
and freedom from contamination of speed 
factors. The Vocabulary test was chosen as 
a measure of verbal ability which correlates 
highly with other verbal tests of the Wech- 
sler scale (10). Both intelligence tests were 
administered individually by the usual stand- 
ard procedures. The inquiry method for the 
Rorschach entailed the use of nonleading 
questions by the examiner to establish the 
determinant scores. No determinant was pre- 
sumed to be operative unless stated by the 
subject without suggestion from the examiner. 
All responses were scored after Klopfer (5). 

Fourteen scores were used for the factor
analysis. These included the Raven’s test
score, the raw score of the Vocabulary test,
and the following Rorschach categories or
combinations: W+, M+, FC + FC', FM + m,
A%, C + C' (includes CF and C'F), FK
+ Fc + Fk, D (includes Dd and d), #Con
(number of content categories, based on the
schema of Phillips and Smith [8]), K + c +
k (includes KF, cF, and kF), F%, and R
(number of responses). The omission of form-level
rating scores was deliberate in view of
their high degree of subjectivity and the
controversy centered about their use (4, 9). The
W and M scores represented plus values in
that obviously poor quality responses were
not included. Less equivocal judgments were
involved in the decisions for poor responses
as opposed to assigning various degrees of
positiveness used in a form-level rating
scheme. A basal rating of 1.0, using Klopfer’s
criteria (5), would correspond to the plus
values of M and W in the present study.

The raw scores of all Rorschach variables 
were converted to T scores since many of 
their distributions were skewed. The vari- 
ables were intercorrelated by the Pearson 
product-moment formula and the resulting 
14 × 14 matrix factorized by the centroid
method. The highest correlation in each col- 
umn was used as the communality value, 
with replacement for each factor extracted 
(7). The centroid factors were subjected to 
two series of orthogonal, single plane rota- 
tions. Rotation was stopped when it became 
apparent that continuing would not improve 
the meaningfulness of the factors. 
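
The first step of the extraction described above can be sketched as follows: the largest correlation in each column is placed on the diagonal as the communality estimate, and first-factor loadings are the column sums divided by the square root of the grand sum. This is a minimal illustration on a hypothetical 3 × 3 matrix. It omits reflection of negatively summed variables, the residual matrices for later factors, and the rotations, and the function names are illustrative only.

```python
import math

def highest_correlation_communalities(r):
    """Estimate each communality by the largest absolute off-diagonal correlation in its column."""
    n = len(r)
    return [max(abs(r[i][j]) for i in range(n) if i != j) for j in range(n)]

def first_centroid_loadings(r):
    """Loadings on the first centroid factor: column sums over the root of the grand sum."""
    n = len(r)
    h2 = highest_correlation_communalities(r)
    # Replace the unit diagonal with the communality estimates.
    reduced = [[h2[j] if i == j else r[i][j] for j in range(n)] for i in range(n)]
    column_sums = [sum(reduced[i][j] for i in range(n)) for j in range(n)]
    grand_sum = sum(column_sums)
    return [s / math.sqrt(grand_sum) for s in column_sums]

# Hypothetical correlation matrix, all positive so that no reflection is needed.
toy_r = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
loadings = first_centroid_loadings(toy_r)
print([round(a, 3) for a in loadings])
```

A useful property of the method is that the loadings sum to the square root of the grand sum, which gives a quick arithmetic check on hand computation.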


Results and Discussion 


The original matrix of intercorrelations is 
given in Table 1. The centroid factor loadings 
and the rotated loadings are presented in 
Tables 2 and 3. Four factors were extracted
with their compositions identified in Table 3.
Loadings of .300 or higher are interpreted as
significant.


Factor I' 


Factor I’, with nine variables, clearly seems
to be the intelligence factor in this analysis.
It has positive loadings on both Matrices and


Table 1
Original Matrix of Intercorrelations *

Variable           2     3     4     5     6     7     8     9    10    11    12    13    14
 1. Matrices     713   288   468   190   207  -377  -063   583   133   335   093  -442   210
 2. Vocabulary         259   420   445   249  -328   131   500   303   439   117  -484   375
 3. W+                       529   198   420  -438   278   292   025   509   224  -459   512
 4. M+                             223   467  -417  -036   260   240   460  -088  -542   414
 5. FC+FC'                               191  -288   491   425   515   535   047  -449   585
 6. FM+m                                      -332   101   221   436   471   072  -600   555
 7. A%                                              -455  -280  -081  -673  -176   526  -419
 8. C+C'                                                   152   136   570   443  -418   484
 9. FK+Fc+Fk                                                     386   351  -022  -306   411
10. D                                                                  549  -166   000   748
11. #Con                                                                     258  -433   819
12. K+c+k                                                                         -269   178
13. F%                                                                                  -344
14. R

* Correlations of .296 are significant at the .05 level (two-tailed test).


a 




Table 2
Centroid Factor Loadings

Variable             I     II    III     IV     h²
 1. Matrices       559   -539   -348   -230    777
 2. Vocabulary     653   -448   -114   -247    701
 3. W+             591    185   -287    197    505
 4. M+             578   -183   -329    353    600
 5. FC+FC’         616   -021    395   -236    592
 6. FM+m           581    037   -138    396    515
 7. A%            -634   -207    184    055    482
 8. C+C’           492    598    236   -396    812
 9. FK+Fc+Fk       574   -371    107   -190    515
10. D              493   -178    572    349    724
11. #Con           861    202    182    117    829
12. K+c+k          244    401   -159   -257    312
13. F%            -702   -162    443    215    762
14. R              819    152    405    305    951

Table 3
Rotated Factor Loadings

Variable            I’    II’   III’    IV’     h²
 1. Matrices       882   -001   -001   -001    778
 2. Vocabulary     798    080    195   -143    702
 3. W+             323    320    335    432    506
 4. M+             516   -034    345    463    601
 5. FC+FC’         307    302    506   -384    589
 6. FM+m           297    079    498    415    515
 7. A%            -362   -466   -300   -204    480
 8. C+C’          -044    812    316   -218    809
 9. FK+Fc+Fk       599    044    304   -248    515
10. D              105   -193    800   -183    722
11. #Con           320    415    742    078    831
12. K+c+k          038    552   -020    075    312
13. F%            -576   -581   -114   -278    760
14. R              187    000    956    024    949


Vocabulary. Both variables show no signifi-
cant loadings on any of the other factors.
This supports the findings of Williams and
Lawrence (11). The Vocabulary score corre-
lates or fails to correlate with the Rorschach
variables to about the same degree as does
the Matrices with the exception of FC + FC’.
Although FC + FC’ correlates higher with
Vocabulary than with Matrices, the magni-
tudes of the correlations are too low to be of
practical value. Thus no support can be given
to the assertion that verbal ability operates
independently of general intelligence to affect
Rorschach scores.
The Rorschach variable having the highest
loading on the intelligence factor was the
form-dominant shading category (FK + Fc
+ Fk). It would seem that interpreting three
dimensions or well-configured texture from 
shades of gray requires more “abstracting” 
ability than using form alone or colored 
shapes without such further elaboration. Thus 
interpretations which take into account the 
intellectual factors in the use of FK, etc., 
would have support. The loadings of M + 
and W + are consistent with traditional 
views about their relationship to intelligence. 
The use of the individual scores to predict 
intelligence appears to be of doubtful value 
since none of the correlations with the intel- 
ligence scores is sufficiently high. However 
combining the scores in a multiple regression 
equation seems to offer a useful approach. 
The multiple correlation between Matrices 
and the four variables showing positive load- 
ings in Factor I’ was determined as .675. The 
regression equation derived from the data 
was: $z_m = -.06\,z_1 + .36\,z_2 + .49\,z_3 + .03\,z_4$,
where $z_m$, $z_1$, $z_2$, $z_3$, and $z_4$ represent the
standard scores for Matrices, W +, M +, FK + Fc
+ Fk, and #Con, respectively.
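
The regression weights can be approximately recovered from Table 1 alone. The sketch below assumes the four predictors are W +, M +, FK + Fc + Fk, and #Con (the set named in the Summary), takes their intercorrelations and their correlations with Matrices from Table 1, and solves the normal equations for standardized weights. The solver and all names are illustrative; small differences from the published figures are to be expected from rounding in the printed table.

```python
from math import sqrt

PREDICTORS = ["W+", "M+", "FK+Fc+Fk", "#Con"]

# Intercorrelations among the predictors (Table 1, decimal points restored).
R_XX = [
    [1.000, 0.529, 0.292, 0.509],
    [0.529, 1.000, 0.260, 0.460],
    [0.292, 0.260, 1.000, 0.351],
    [0.509, 0.460, 0.351, 1.000],
]
# Correlations of each predictor with Matrices (Table 1, row 1).
R_XY = [0.288, 0.468, 0.583, 0.335]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Standardized regression weights: solve R_XX * beta = R_XY.
betas = solve(R_XX, R_XY)
# Squared multiple correlation is the sum of beta_i * r_iy.
multiple_r = sqrt(sum(b * r for b, r in zip(betas, R_XY)))

for name, beta in zip(PREDICTORS, betas):
    print(f"{name:10s} {beta:+.3f}")
print(f"multiple R = {multiple_r:.3f}")
```

The weights obtained this way fall close to the published -.06, .36, .49, and .03, and the multiple correlation close to the reported .675.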

The two variables having negative loadings 
on the intelligence factor, A% and F%, also 
lend support to traditional views. It appears 
that the dull individual may give a higher 
proportion of form-determined responses be- 
cause he is less able to analyze his percepts, 
rather than because the percepts differ ma- 
terially from those of brighter subjects. The 
concept of constriction, in the emotional 
sense, would not be necessary to account for 
the extremely high F% record in such cases. 
However it seems likely that if a bright sub- 
ject gives almost all Fs, factors other than 
intelligence are crucial. The concept of con- 
striction may be more appropriately used in 
such instances. These findings also tend to 
argue against viewing the extremely high F% 
records of subjects having below average in- 
telligence as reflecting a barrenness of per- 
sonality without extra-Rorschach evidence. 


Factor II’ 


Seven variables load on Factor II’ which 
may be tentatively designated as “low-form.” 
Other studies have proposed that low-form 
involves a lack of perceptual control (11, 13), 
which carries with it some interpretive con- 
notations which the present study was not 
designed to evaluate. The negative loadings of
A% and F% are consistent with the low-
form designation. The fact that FC + FC’
has a positive loading is inconsistent but it is 
relatively low (.302) and may represent an 
artifact. It might be considered to reflect the 
problem that is involved in the scoring of 
FC versus CF. In such instances, when both 
form and color are used, the nondirective in- 
quiry method requires that the subject indi- 
cate which is dominant. It is conceivable that 
there is a highly arbitrary solution in many 
instances with the direction of choice depend- 
ing upon factors unknown to the examiner. 
One method of resolving this problem may be 
found in the technique used by Baughman in 
which substitute cards are offered to the sub- 
ject to test the importance of the various 
stimulus qualities (3). 

The identification of a low-form factor 
supports the use of form-dominance as a di- 


ler into the tra- 
ditional units such as C, K, k, c, etc. The as- 


be consistent 
actual clinical practice (8, 9). 


Factor III’ 


Factor III’, with ten variables loading on
it, seems to represent productivity. It has ap-
peared in most factorial studies and R has
been shown to correlate significantly with
most Rorschach scores (9, 11, 13). The vari-
ables having the highest loadings, R, D, and
#Con, serve to identify the factor.


It is apparent that productivity plays a
greater role for the form-color determinants
than for the form-shading ones. The latter
depend more upon intellectual factors which
may partly account for their relative infre-
quency. This is not as clear-cut in the case
of the low-form color and shading responses,
neither of which has loadings on the intelli-
gence factor. Low-form color responses are
also more influenced by productivity than are
low-form shading responses. This differential 
effect of productivity upon color responses as 
contrasted to shading responses may be in- 
dicative of the accidental nature of the Ror- 
schach blots. It should be apparent that the 
total natural divisions (i.e., D and d) of the 
five colored cards is greater than that for the 
five achromatic cards. The distribution of 
the configurational divisions of the blots cre- 
ates limiting factors in the frequency poten- 
tial of color versus shading responses. The 
use of ratio scores of high to low dominance 
of form, such as FC: CF + C, offers compli- 
cations for interpretation not usually consid- 
ered since productivity and intelligence fac- 
tors play different roles for each variable. It 
is also apparent that correcting for produc-
tivity by dividing the scores by R may not
be as effective for these scores as it is for F% and A%,
both of which seem fairly well controlled in
this way.


Factor IV' 


There are only four variables with signifi-
cant loadings in Factor IV’. It may be identi-
fied as Movement, as it has been designated in a
previous factorial study (11). The loading of
W + might be taken as an indication that
good Ms are frequently good Ws, and their
shared role in the intelligence factor should
be noted. The composition of Factor IV’ ar-
gues against considering FM and M as rep-
resenting fundamentally different appercep- 
tive processes. The only support for their 
separation as scores comes from the different 
loadings each has on intelligence. It appears 
that as one goes up the scale of intelligence, 
M increases at the expense of FM. The in- 
terpretation of FM as representing impulses 
for immediate gratification (5) may not be 
necessary to account for the changes in 
FM:M proportions with age and sophistica-
tion. The relative frequencies of H and A re- 
sponses also have to be considered, since each 
type of movement perception is generally de- 
pendent upon the occurrence of human or 
animal associations. One might profitably ex- 
amine what happens when individuals of dif- 
ferent levels of intelligence and mental age 
are faced with configurations so designed as


to be highly provocative of animal or human 
features only. 

The findings of an earlier study (13) re-
porting a significant correlation between FC
and M were not borne out. The negative load-
ing of FC + FC’ in Factor IV’ offers some
evidence for placing movement and color on
opposite poles, although the role of FC in
the present study may have been distorted
as already discussed. Whether Sum C is the
most adequate way to represent the color
pole was not tested and remains problematical
(9). The interpretive meanings that would
be useful in considering M and FM along
similar dimensions also remain to be studied.
One possibility suggested is that both M and 
FM represent the same factor at different 
levels of sophistication. 


Summary and Conclusions 


A group of behaviorally normal adults 
representing a broad range of intelligence, 
education, and occupation were given the 
Raven’s Progressive Matrices, the Vocabulary 
test of the Wechsler-Bellevue Scale Form I, 
and the Rorschach test. The scores were sub- 
jected to a factor analysis resulting in the 
extraction of four factors identified as: intel- 
ligence, productivity, low-form, and move- 
ment. The evaluation of the factorial com- 
positions suggests the following: 

1. Both the verbal and nonverbal meas- 
ures of intelligence affect Rorschach scores 
in a similar manner and help identify a single 
general intelligence factor in the Rorschach. 
A correlation based on the multiple regres- 
sion equation between the Matrices scores 
and four Rorschach categories (M +, W +, 
FK + Fc + Fk, and number of content cate- 
gories) yields a value which accounts for 
about 45 per cent of their common variance. 
This provides a limited but potentially use-
ful predictor of intelligence from the Ror-
schach.

2. Productivity is minimally related to in-
telligence. This factor is more dependent upon
the extent to which a subject uses part as op-
posed to whole areas of the blots and other
nonintellectual and situational variables. Cor-
recting for the effects of productivity by di-
viding by the number of responses is not
equally satisfactory for all variables, although
A% and F% scores seem adequately corrected in
this manner.


3. There is a factorial similarity among 
the high form-dominant color and shading 
scores as well as a unique factor of low form- 
dominance which includes both color and 
shading. This suggests that the traditional 
method of separating determinants into the 
various color and shading categories may be 
unnecessary. 

4. Movement may be regarded as a sepa- 
rate factor which includes both M and FM 
+ m within it. The chief differentiation be- 
tween the two major movement categories 
appears in the finding that M loaded on the 
intelligence factor while FM did not. 


Pitfalls in Interpretation of Parental Symbolism
in Rorschach Cards IV and VII 


Sol Charen


Montgomery County Mental Health Clinic, Rockville, Maryland 1

1 Now at Child Guidance Service, Jewish Social Service Agency, Washington, D. C.


There is an accepted practice among Ror-
schach workers to think of Cards IV and VII
as symbolizing the father and mother figures
respectively (1, 3, 4, 7, 8, 9). Thus, inter-
pretation to these two cards becomes a mat-
ter of treating responses made to them as in-
dicative of the subject’s attitude to his par-
ents when he was a child. The rationale for
such interpretation is based on empirical evi-
dence or theoretical reasoning. Bochner and
Halpern state of Card IV, “The heavy male
figure may suggest the father or authority in
general; this may be pleas ...
Its dark quality and overwhelming character
are particularly distur ...
... arental authority is still
... mores an unsolved prob-
lem” (3, p. 81). And for Card VII, they add,
“... even female figures
... ‘dancing girls’), as well
... light quality, give this
... ity, frequently with ma- ...”
(3, p. 82).

... card.” Nine chos ...
IV as the father ...
and ten Card X as the mother card. The results
were statistically significant. It is of in-
terest to note (in view of the statement by 
Bochner and Halpern about the disturbing 
quality of Card IV) that those who chose 
Cards IV and VII gave clinical evidence, ac- 
cording to Meer and Singer, that they were 
fond of their parents.

To use Card IV as a symbol of the father 
or authority figure, on the basis of this re- 
search, would mean that twelve out of fifty 
times the interpretation would be justifiable. 
Thirty-eight out of fifty times the use of such 
symbolism would result in error. The same
reasoning holds true for the hypothesis that
Cards VII and X are mother-figure repre-
sentations.

Rosen (11) repeated the experiment of
Meer and Singer, using as subjects 180 uni-
versity psychology students. While statisti-
cally significant results were again obtained
for Cards IV and VII, there were such marked
individual differences as to the symbolic mean-
ings of all ten cards that gross errors could
be made if only these two cards were accepted
as universal symbols of parental figures.

The third article in the literature dealing 
with this topic is that of Hirschstein and 
Rabin who recently claimed additional sub-
stantiation of the symbolic values of Cards
IV and VII as a result of their study of 
adolescent delinquents. Their main hypothe- 
sis was “individuals who are adolescent de- 
linquents and in whose early background there 
were no significant mother or father figures 
would react more readily and more easily to 
these cards (IV and VII) than would a simi- 
lar group of delinquents who grew up in the 
standard family situation and who had the 
opportunity of acquiring concepts of father
and mother figures, disturbing as they may 
be” (7). The hypothesis was verified. While 
response times were consistently slower to all 
cards for the delinquents with parents, only 
to Cards IV and VII were there statistically 
significant differences between them and the 
delinquents without parents. 

A theoretical criticism of the hypothesis 
made is that it is difficult to conceive of any 
person, unless he is schizophrenic, who does 
not have a parental surrogate of some sort. 
The mere fact that natural parents were lack- 
ing for the one group is not in itself proof to 
support the hypothesis investigated. The au- 
thors did not indicate whether deep analysis 
of these adolescents showed the one group 
lacking to a greater degree the identification 
or introjection of a parental figure.

Schachtel, in his Rorschach investigation of 
500 juvenile delinquents, reported that they 
differed from a matched group of nondelin- 
quents of the same socioeconomic class pri- 
marily in their attitude to authority. 

The most important consideration . . . (in deter- 
mining whether or not a boy was delinquent) was 
whether or not the boy showed much dependence 
on, and fear of, authority. The more such fear and 
dependency had become part of the character struc- 
ture and the commands and prohibitions of the sig- 
nificant authoritative adults had been internalized, 
the more likely the boy would not become delin- 
quent. Internalization of authority (or to use Freud’s 
terminology, formation of a strong super-ego) thus 
was the most decisive single factor or rather constel- 
lation of factors in making the judgments in delin- 
quency and non-delinquency (12, p. 148). 


In view of Schachtel’s hypothesis, which he 
substantiated in making only two errors in 
500 matchings, one can find no basis for 
Hirschstein and Rabin’s hypothesis. The sta- 
tistically significant differences to Cards IV 
and VII found in their research are proof of 
some dissimilarity between their two groups, 
but not necessarily of attitudes to parental
figures.


Theoretical Criticisms


Roy Schafer’s criticisms of the assignment
of fixed symbolic meanings to Rorschach
cards are specifically applicable to this dis-
cussion. He states:


Card VII... has been held to represent the 
mother figure, and all responses to this card to rep- 


resent therefore concepts and attitudes toward the 
mother figure... . The errors lie in reasoning (1) 
as if no adaptive and defensive ego functions stand 
between the stimulus and the deep dynamics of the 
individual, (2) as if there were no relatively neutral 
images available to the patient in his efforts to cope 
with the stimuli, (3) as if there could be only one 
dynamic meaning in the card or area in question, 
(4) as if a statistical trend is the same as a perfect 
correlation, and (5) as if all we have learned about 
personality-rooted individual differences and percep-
tual organizing principles were still unknown (13, 
p. 146). 


Rorschach himself revealed an intuitive 
ability to use dynamic symbolism, but not by 
one-to-one reasoning. Those who are inter- 
ested in a psychoanalytic method of deter- 
mining the patient’s attitude to the father 
figure should reread his article, “The appli- 
cation of the Form Interpretation test,” in 
order to see how careful and conscientious he 
was in determining that the midline responses 
symbolized the father figure for the particular 
patient he had tested (10, pp. 209-213). 

Freud himself, contrary to popular belief,
did not arbitrarily assign a fixed meaning to 
symbols. Even he, gifted as he was in em- 
pathic understanding, laid down a cardinal 
principle that the patient must furnish the 
meaning of his symbols. In writing of dream 
interpretation he stated: 


At the same time I must expressly warn the in- 
vestigator against overestimating the importance of 
symbols in the interpretation of dreams, restricting 
the work of dream-translation to the translation of 
symbols and neglecting the technique of utilizing the 
associations of the dreamer. The two techniques of 
dream-interpretation must supplement one another; 
practically, however, as well as theoretically, preced- 
ence is retained by the latter process which assigns 
the final significance to the utterance of the dreamer, 
while the symbol translations which we undertake 
play an auxiliary part (5, p. 247). 


Frieda Fromm-Reichmann’s comments on 
symbolism, while not intended for the Ror- 
schach, are equally pertinent. She stated: 


Their significance (symbols) definitely varies with 
the personality, the life-circumstances and the prob- 
lems of the dreamer. In one person’s dream, for in- 
stance, a snake may appear as a male symbol, while 
another dreamer may use a snake to express female 
shrewdness and seductiveness. Again a snake may be 
used by an archeologist to express the attributes of 
one or another of the multitude of male or female 
gods or goddesses whose total or partial embodiment 
is that of a snake (6, p. 165). 

Freud and Fromm-Reichmann are quoted
because they emphasize that any attempt to
determine what lies behind a patient’s use of
symbols must depend on the specific mean-
ing of a symbol to him. The assumption of
a fixed symbolic meaning to any Rorschach
card does not seem justified, in my opinion,
either on an experimental or theoretical ba-
sis, at the present time.


Investigation of the Problem 


The procedure used by Rorschach and 
Oberholzer of joint investigation of Rorschach 
responses and clinical psychoanalytic material
is the ideal method of determining whether 
Cards IV and VII have specific meanings. 
This method of investigation seems not to 
have been reported upon in the literature de- 
spite claims of empirical validation (4, 9). 
There are obvious difficulties in this kind of 
collaboration since only a few Rorschach 
workers are fortunate in working with thera- 
pists who do deep analytical investigation. 

In his private practice as a testing psy-
chologist the writer administers the Rorschach 
in the usual manner with a free association 
followed by an inquiry period. The individual 
tested is then asked to “Pick out the card 
which is most like your mother,” and then, 
“Pick out the card which is most like your 
father,” after which he is asked the reasons 
for his choices. This kind of additional in- 
quiry was adopted on the hypothesis that a 
response given to the Rorschach cards must
have its origin in the unconscious of the pa- 
tient. In terms of the premise of Meer and 
Singer and accepted trends in the Rorschach 
field the additional hypothesis can be tested 
that patients’ responses to the above questions 
would have their origin in the same uncon- 
scious motivation to a card which stirred un- 
conscious memories of parental images. 

With but few exceptions the questions were
interpreted by patients to mean, “Which re-
sponse that you gave to the Rorschach re-
minds you of your parent?”

The subjects accepted the questions as part 
of the testing routine. Those who asked for 
additional information were told to rely on 
their own interpretation of the questions. 
Over fifty successive adults were tested in 
this manner. Unfortunately, no records were
kept of biographical data, but since the re-
search was based on the assumption of the 
universality of meaning to symbols this omis- 
sion is not important. All tested were of both 
sexes, white, and from all socioeconomic 
classes except relief groups. They were re- 
ferred for projective testing by private psy- 
chiatrists, generally after a first or second 
treatment interview. The referring requests 
were usually for evaluation of personality 
assets and potentiality for treatment. All were 
ambulatory, nonorganic, and included all 
diagnostic categories, but as of the time of 
testing were not in need of hospitalization. 


Results 


The list which follows gives an idea of
areas selected, responses given, and, in italics,
the reasons given by these patients why a
particular response or area reminded them of a
parent. Numbers used to locate D and Dd
areas are those of Beck (2, pp. 30-35).


Card I 
Area 
Father 
Ww A face with eyes. My father’s face. 


Mother 


Woman (D 4) held by two people. I see
my mother dragged around by my
aunts.

Woman’s body. My mother was heavy like
this figure.

... My mother liked to
wear pretty clothes.

Woman. My mother was very neat and
kept herself very straight.

Form of a lady and hands out as if beg-
ging for help. Doesn’t know where to turn
because her head isn’t there. My mother
wasn’t capable of doing anything
herself.


wW 


D3 


D4 


Card II 
Father 


Medical illustration. My father is a doctor. 

Fire and smoke. Reminds me of a fire.
When I was a kid I was afraid the house
would burn down because my father
smoked in bed and burned the bed
many a time.

Bears. My mother used to call my father 
a bear. 

Teddy bears. My father is devoted to my 
young son who had a teddy bear. 

Two bears praying. My father is very re- 
ligious. 


D1 


pi 


D3 


a 
Mother
Two bears gossiping. My mother loved to
talk.
Bears. Nose of the bear reminds me of my
mother.


Father and Mother 


Circus bears kissing and fighting. Like my 
parents. 


Card III 


Father 
Two men bowing and polite. My father 
was a tall lanky person like these. 
Men in tuxedos. Very formal like my 


father. 
Two men. Because these are men.
Marionettes. My father was clever and did 


parlor tricks.
Two men. My father had a long nose like 


these men. 


Mother 
Two women exercising. My mother always 
did because she was so afraid to get fat. 
Hands praying. Reminds me of my mother 


pleading. 
Card IV 


Father 

Clown. My father loved the circus. 
Bear skin. My father was an old bear. 
Something solid and impregnable 

sticky. Looks masculine and short. 
Phallic Symbol. Because it’s masculine. 
Looks like a skull. My father is dead.
Face. My father was a poet and wore his 

hair like that. 


and 


Mother 

Dragon. Has a mean face like my mother.
Hair wig. My mother had black curly hair.
Something enveloping me. Like my mother.
She used to stifle me with her control.


Card V 


Father 
Bat. Dull and not much of a picture and 
not much of a father. 


Mother 


Huge wings. Wants to protect me as my 


mother did. 
Card VI 


Father 


Bearded profile. My father was
a physically powerful man although small

Ww 


wW 


D4 
D 6&7 


wW 


D 6&7 


Di 


and yet he got steamrollered by my
mother. I saw him as an animal with
powerful shoulders and no head. My 
father had no head and couldn’t stand 
up to my mother. 

Fur. My father liked to hunt. 


Mother 


Fox skin. My mother was a fox-like char- 
acter. 


Card VII 


Father 


Clouds. My father’s favorite hymn was 
“Unclouded Day.” 

Two arms up. My father loved to expostu-
late. 


Growling animal. Like my father. 


Mother 


Iceberg. My mother came to this country 
from the old country in a ship over the 
water and saw icebergs. 

Old women gossiping. Because they’re
women.

Two women dancing. My mother as a girl
was thin waisted.

Pixies. My mother is like a pixie.

Two nude women. My mother isn’t nude.
(sic!)

Two women. My mother is a woman.


Father and Mother


Two lambs jumping. My parents are lambs.

Rocks. My parents are dead; reminds me
of cemetery tombstones.

Church and two people. My parents passed
on.


Card VIII 


Father 
No response. Triangles and squares and 
colors. Balance between the systematic 
and colors like my father. 
Beavers. My father worked very hard. 
Two animals. My father was stern and 
that’s how these animals look. 
Two animals. Animals are tame and my 
father was good and kind and nice. 
Warrior’s head. My father was strict. 
Blood. My father died of bleeding in his 
throat. 


Mother 
Flowers. My mother loved flowers. 
No response. Something soft and has color. 
Flowers. She liked flowers. (Given by two 
patients.) 
Gladioli. My mother loves gladioli. 
Card IX

Father
D1 Men. Like my father.
D3 Men laughing. My father used to drink and
loved to run off his mouth.
No response. My father had auburn hair.

Mother
W No response. Because it’s cheerful like my
mother. All bright colors.
Sun breaking through the clouds. Calm and
peaceful like my mother.
Tropical flowers. She loved beauty.
Flowers. All are pastel colors and she likes
flowers and is soft.

Card X

Father
D4 Sea horses. My father and I loved to walk
on the beach and find small dead ones.

Mother
W Flowers. Because my mother liked flowers.

No response. Utter confusion and lack of
coordination like my mother, but a pleas-
ing appearance.

No response. Pastel colors remind me of
my mother because she is fair.

Dds 29 Face. Scornful like my mother.


Discussion 


The results speak for themselves. No claim
is made that there is a father or mother card,
rather that, as Schafer points out, there are a
variety of reasons against such an assump-
tion.
Summary 


This study is concerned with the present-
day tendency toward interpretation of re-
sponses to Cards IV and VII of the Ror-


experimental evidence for such interpretations 
is sparse and there seems no convincing theo- 
retical proof for such assumptions. Further 
validation of determining whether such sym- 


bolic meaning did exist was attempted by the 
writer. He found that when over fifty pa- 
tients were asked to select Rorschach cards 
which reminded them of their own parents 
they tended to use all ten cards in such man- 
ner that no distinction between Cards IV and 
VII and the other eight cards could be made. 


Received May 8, 1956. 


The data presented in this paper are based
upon a Q array designed to quantify indi- 
viduals’ assessments of themselves and assess- 
ments of the individuals by clinicians. These 
assessments were intended for analysis in con- 
junction with physiological measures obtained 
on the same individuals (1). It was decided 
in advance that the construction of the rat- 
ing scale should incorporate certain essential 
characteristics as follows: (a) Have satisfac- 
tory interjudge and repeat reliability. (b) Ap- 
proximate as nearly as possible the kind of 
operations that clinical assessors customarily 
perform in the course of their professional 
duties. Involved in this was the consideration 
that clinicians aspire to make discriminat- 
ing judgments about both intra-individual 
characteristics and interindividual differences. 
Also involved is the desire of the clinician to 
make global and configural evaluations. (c) 
Provide a framework of discourse that would 
be equally useful to clinical assessors from 
different disciplines assessing from independ- 
ent heterogeneous samples of behavior. (d) 
Bridge the gap between the clinical assess- 
ment and the self-description, and provide a 
basis for studying the nature of the relation-
ships between other- and self-assessment. 

On the basis of these criteria, the Q array 
as described by Stephenson (9) was selected 
as the type of rating scale most appropriate. 
This paper will confine itself to describing 
two specific unanticipated problems which 
were noted after the Q array had been con- 
structed and data collection started. (a) The 
problem of the relationship between two vari- 
ables designated as Health-Sickness (emo- 
tional) and Social Desirability; (b) The 
problem of confounding, i.e., the difficulty in 
ascertaining the specific effects of these vari- 
ables upon other variables in an assessment. 
The first issue examined is some of the char- 
acteristics of moral and social value judgment 
which penetrate personality assessment. The 
second, the fact that it is necessary to sort 
these value judgments out in order to make 
valid analysis of data based on personality 


assessments. 
Method 


There were two contrasting groups of sub- 
jects: (a) 24 adult male patients hospitalized 
for psychiatric treatment, who will be re- 
ferred to as the sick group; (b) 24 male uni- 
versity students screened for absence of no- 
table current psychiatric difficulties, who will 
be referred to as the well group.* 

8 Well group was paid in connection with a study 
supported by a grant from the Air Force School of 
Aviation Medicine (1). 


d 

58 W. S. Kogan, R. Quinn, A. 
The array consisted of 96 items which
sampled 25 personality variables. Since the
personality variables were used solely as
scores, their names will not be listed.

For each assessment (both self and clini- 
cal) of each individual in either group, the 
following operations were performed: (a) 
The items, typed on cards, were sorted by 
the assessor in the following way. He was 
asked to describe the individual as best he 


array variable scores were obtained by aver-
aging the scores of the several items sampling
them. (d) Thus each assessment was reduced
to 25 scores. These were
used to elucidate the variables, Health-Sick-
ness and Social Desirability, with which this
paper is concerned.

The data presented are based upon four as- 
sessments of each individual in both groups. 
The assessments were performed in the order 
in which they are described: (a) First Self- 
Assessment, “describe yourself,” designated as 
(S1); (b) Clinical Assessment by a psychia- 
trist after one or more diagnostic interviews, 
designated as Work-up Assessment (WA); 
(c) Clinical Assessment by a psychiatrist
other than the one who performed WA, after
a one-hour “stress” interview,⁵ designated as
Stress Assessment (SA); (d) Second Self-
Assessment, “Describe yourself as you think
the doctor who interviewed you during the
test thinks of you,” designated as Self-via-
Doctor (Sv).


Table 1

Normal Distribution of the Q-Array Items

[Table: number of Q-array items assigned to each
degree of characterization, from less characteristic
to more characteristic; cell values illegible in
the source.]

The procedure produced 192 assessments,
four for each individual in the sick group and
four for each in the well group. The next set
of operations reduced these 192 assessments
to eight mean assessments, an average for
each of the four types of assessment for each
group as described above. This provided us
with the characteristics which the assessors
attributed, on the average, to these two
groups.

Operations performed to derive each mean
assessment were as follows: (a) Each of the
25 scores for a particular assessment (e.g.,
First Self-Assessment [S1] for patients) was
individually summed across all patients and
divided by 24, the number of subjects in each
group; (b) These mean scores are tabulated
in their original order (25 scores in all) and
are designated as Group Mean Assessment.
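
The averaging step just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `group_mean_assessment` and the toy numbers are hypothetical, and the study itself used 24 subjects with 25 scores each.

```python
from statistics import mean


def group_mean_assessment(assessments):
    """Average each variable score across subjects.

    `assessments` holds one list per subject; every list carries the
    variable scores in the same fixed order (25 per subject in the paper).
    """
    n_scores = len(assessments[0])
    if any(len(a) != n_scores for a in assessments):
        raise ValueError("every assessment must have the same number of scores")
    # Sum each score position across subjects and divide by the number of subjects.
    return [mean(a[i] for a in assessments) for i in range(n_scores)]


# Toy data (hypothetical): 3 subjects, 4 variable scores each.
toy = [
    [5.0, 6.0, 4.0, 7.0],
    [7.0, 6.0, 2.0, 5.0],
    [6.0, 3.0, 6.0, 6.0],
]
toy_mean = group_mean_assessment(toy)
print(toy_mean)
```

The result keeps the scores in their original order, so two Group Mean Assessments can be correlated position by position.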


Results 
Relationships Between Assessments 


Tables 2 and 3 show the intercorrelations
(product moment) of Group Mean Assess-
ments and provide a basis for the study of
some of the characteristics of both the as-
sessors and the instrument of quantification,
the Q array. The fact that the size and di-
rection of the correlations are stable when the
two successive self-assessments are compared
and when the two independent clinical as-
sessments are compared provides indication
of the reliability of the relationships.

5 Subject, supine with a polygraph recording ten
simultaneous physiological measures, was interviewed
on specific programed conflict areas by assessing psy-
chiatrist.

Table 2, under the heading “Correlation,” 
presents the relationships between the Group 
Mean Self-Assessments (S1 and Sv) with
Group Mean Clinical Assessments (SA and 
WA) for the two groups separately. The cor- 
relations in Table 2 under the heading “HS 
and SD removed” will be referred to in the 
text below. 

The data referred to as “Correlation” in 
Table 2 indicate that while the well group 
describes itself as it is described by clinicians, 
the sick group’s self-description bears no sig- 
nificant relationship to how they are de- 
scribed by clinicians. 

Table 3, under the columns labeled “Cor- 
relation” presents the relationship between 
each of the Group Mean Self-Assessments of 
the sick group with those of the well group. 
These correlations indicate that there is con- 
siderable agreement between the way the sick 
and well groups describe themselves. 

Similarly, Table 3 also presents the cor-
relations between the Group Mean Clinical
Assessments of the sick group with the well
group. These correlations indicate that the
average clinical assessment of the sick group
bears no significant relationship to the aver-
age clinical assessment of the well group.


Table 2

Correlations* Between Each Group’s Clinical
and Self-Assessment

                          Clinical assessments
                                        Partial correlation,
                      Correlation       HS & SD removed
Group                 SA      WA        SA      WA
Sick Group
  Self-Assessment
    S1                .04    −.01       [illegible]
    Sv               −.02    −.06       .56     .61
Well Group
  Self-Assessment
    S1                .88     .78       [illegible]
    Sv                .92     .78       [illegible]  .62

* In this and subsequent tables, r = .40 and .51 are significantly
different from zero at the 5% and 1% levels respectively. N − 2
= 23, two-tailed test.


Table 3

Correlations of Self-Assessment and Clinical
Assessment Between Groups

                               Well group
                                        Partial correlation,
Assessment and        Correlation       HS & SD removed
group
Self-Assessment       S1      Sv        S1      Sv
Sick Group
    S1          [illegible]   .76       .66     .60
    Sv                .84     .84       .69     .67
Clinical Assessment   SA      WA        SA      WA
Sick Group
    SA               −.19    −.03       .59     .65
    WA               −.18    −.01       .73     .62
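
The significance thresholds quoted in the note to Table 2 (r = .40 and .51) can be checked with a short sketch. It assumes that each correlation is computed over the 25 variable scores, giving 23 degrees of freedom, and it takes the two-tailed critical t values for 23 df (2.069 and 2.807) from a standard t table; the helper name `critical_r` is hypothetical.

```python
import math


def critical_r(t_crit, df):
    """Smallest |r| that reaches significance, given the critical t for df."""
    return t_crit / math.sqrt(t_crit ** 2 + df)


DF = 23       # assumption: 25 scores per assessment, minus 2
T_05 = 2.069  # two-tailed 5% critical t for 23 df (standard table value)
T_01 = 2.807  # two-tailed 1% critical t for 23 df (standard table value)

r_05 = critical_r(T_05, DF)
r_01 = critical_r(T_01, DF)
print(round(r_05, 2), round(r_01, 2))
```

Under these assumptions the two values round to .40 and .51, which agrees with the note.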
The pattern of these correlations suggests 
that the sick group has less “insight” than 
the well group. The question still remains— 
insight with respect to what? With this ques- 
tion in mind, the work of Edwards (2) and 
others (5, 7, 8) with the problem of social 
desirability and its effect on the probability 
of endorsement of personality inventory items 
directed our attention to the effect of social 
desirability on our assessments. The articles 
document the strong trend on the part of an 
individual to ascribe behavior or attributes 
to himself which are socially desirable. This 
trend operated independently of whether 
other aspects of the behavior or attitude were 
accurate for the individual involved. 


Relationship Between Assessments and SD
and HS


In keeping with the nature of the groups, 
i.e., well-sick, we also became interested in 
exploring the effect of the variable of Health- 
Sickness upon the Q-array assessments. This 
further analysis led to some interesting re-
sults. Measures of Social Desirability, desig-
nated SD, and Health-Sickness, designated
HS, were constructed as follows: (a) Six
clinicians (psychiatrists and psychologists)⁶
were asked to sort the Q array with respect
to these variables, Health-Sickness and So-
cial Desirability, as indicated in Table 1.
Health and high social desirability were
sorted at the high end of the continuum;
(b) These two sets of sorts were reduced to
25 scores as described above.


Table 4

Correlation* of HS and SD with Group
Mean Assessments

                   Self               Clinical
                assessment           assessment
Group           S1      Sv           SA      WA
Sick Group
    HS          .59     .68         −.53    −.5[illegible]
    SD          .67     .71         −.45    −.54
Well Group
    HS          .90     .87          .31     .65
    SD          .85     .82          .76 (one clinical value illegible)

* ... were also significantly different from zero at the ...
confidence. For example, for the sick group the following ...


The intercorrelations between the Q sorts 


Health-Sickness (HS) and Mean Social De- 
sirability (SD). These operations, in effect, 
weighted each of the 25 scores in the Q array 
for HS and SD respectively. The correlation
between HS and SD was .89. 

Table 4 gives the correlations separately be- 
tween HS, SD, and the various Mean Group 
Assessments and indicates the high degree to 
which these two dimensions entered into the 
assessments. It should be noted that these re- 
lationships are true of both self and clinical 
assessments and raise some difficult problems
about assessment in general. To what extent
are assessors, both self and observer, involved
in what might be called cultural stereotyp-
ing? In other words, to what extent do so-
cial desirability and health-sickness determine
the variance in assessments? Table 4 suggests
that in our Q array too much of the variance
derives from these dimensions and insuffi-
cient amounts are being determined by vari-
ables which describe personality; in other
words, assessments may be based more on
cultural stereotypes than on factors related
to kinds of people.

6 These raters included individuals who partici-
pated in assessing subjects for the study.


Partialling Out Effect of HS and SD 

The second problem is that of confounding.
Edwards and Horst (3), in discussing social
desirability as a variable in Q-array assess-
ment studies, have demonstrated mathemati-
cally that the interpretation of the results ob-
tained from Q arrays would be much plainer
if this variable were controlled. The opera-
tions performed below are an attempt to re-
move the effects of HS and SD from the cor-
relations already referred to in Tables 2 and
3, and demonstrate with the data at hand the
issues raised mathematically by Edwards and
Horst. The effect of HS and SD upon the cor-
relations in these tables was removed by the
method of partial correlations as given by
Garrett (6, pp. 433-434) for the problem of
four variables. In terms of the subjects used,
it was reasoned that HS is the pivotal vari-
able and it was therefore partialled first. The
reverse procedure was omitted. The correla-
tions presented in Tables 2 and 3 under the
appropriate heading are those resulting from
partialling out both HS and SD. The correla-
tions produced by partialling out only HS
were of the same order of magnitude as with
both partialled out. For example: With ref-
erence to the data for the sick group in
Table 2, the mean partial correlation be-
tween self and clinical sorts with HS alone
partialled out is .54; with reference to the
well group in Table 2, the mean partial cor-
relation between self-sorts and clinical sorts
is .63; for Table 3, the mean partial corre-
lation between the two sets of self-sorts with
just HS partialled out is .68; the mean par-
tial correlation between the clinical sorts with
just HS partialled out is .55. These facts to-
gether with the high correlation between HS
and SD would tend to substantiate the po-
sition that HS and SD are essentially the
same variable.
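
The partialling operation can be illustrated with a minimal sketch. It assumes the standard recursive formula for partial correlation (remove one variable, then remove the second from the residual correlations), which may differ in layout from Garrett's worked procedure. The function names are hypothetical, and the inputs are the sick-group values for S1 and SA as best they can be read from Tables 2 and 4, so they should be treated as illustrative only.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_r2(r_xy, r_xz, r_yz, r_xw, r_yw, r_zw):
    """Second-order partial: remove z first, then w from the residuals."""
    xy_z = partial_r(r_xy, r_xz, r_yz)
    xw_z = partial_r(r_xw, r_xz, r_zw)
    yw_z = partial_r(r_yw, r_yz, r_zw)
    return partial_r(xy_z, xw_z, yw_z)


# Illustrative inputs (sick group, S1 vs. SA), read from a degraded scan.
r_s1_sa = 0.04                  # self vs. clinical assessment
r_s1_hs, r_sa_hs = 0.59, -0.53  # each assessment vs. HS
r_s1_sd, r_sa_sd = 0.67, -0.45  # each assessment vs. SD
r_hs_sd = 0.89                  # HS vs. SD

hs_only = partial_r(r_s1_sa, r_s1_hs, r_sa_hs)
hs_and_sd = partial_r2(r_s1_sa, r_s1_hs, r_sa_hs, r_s1_sd, r_sa_sd, r_hs_sd)
print(round(hs_only, 2), round(hs_and_sd, 2))
```

With these inputs, partialling HS alone already moves the correlation from near zero to roughly .5, and removing SD as well changes it only slightly, which is the pattern the text describes.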


Discussion 


From the foregoing and from an over-all 
examination of Tables 2 and 3, it is clear 
that with the removal of the effect of what 
might be usefully labeled as a cultural stereo- 
type, the pattern of correlations has changed 
distinctly.⁷ Examination of Table 2 indi-
cates that for the sick group the partialling
out of HS and SD from the correlations be-
tween the self and clinical assessments re-
sults in a distinct shift. This shift from in-
significant correlations to correlations of .54
to .61 is from absence of significant relation-
ship to a relationship accounting for about
30% of the residual variance. The shift for
the well group is in the opposite direction;
i.e., from correlations accounting for 70% of
the variance to ones accounting for about
40% of the residual variance.

Similarly, examination of Table 3 indicates 
that partialling out HS and SD from the cor- 
relation between the Self-Assessments of the 
two groups results in a small but consistent 
drop in the size of the correlations. The same 
operation for the correlations between the 
clinical assessments of the two groups re- 
sults in a shift from insignificant correlations 
to ones accounting for about 40% of the re- 
sidual variance. 
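
The percentages of variance quoted in this discussion follow from squaring the correlations. A small sketch, using correlations mentioned in the text and in Table 2 (the helper name `variance_accounted` is hypothetical):

```python
def variance_accounted(r):
    """Proportion of variance shared by two variables correlated at r."""
    return r ** 2


# .54 and .61: sick-group partial correlations cited in the text.
# .88 and .78: well-group zero-order correlations from Table 2.
for r in (0.54, 0.61, 0.88, 0.78):
    print(f"r = {r:.2f} -> {variance_accounted(r):.0%}")
```

Squaring .54 and .61 gives values close to the "about 30%" figure, and the well-group correlations square to values around the 70% mentioned above.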

It would appear now, with the removal of 
the variance due to HS and SD, that both 
self-assessments and clinical assessments ac- 
count for approximately the same amount of 
residual variance for the two contrasting
groups. Within the limitations imposed by
the size of these correlations, an interesting
hypothesis suggests itself; i.e., that with HS
and SD controlled, self-assessments might be
substituted for clinical assessments in some
respects. It would be necessary, of course, to
ascertain specifically what accounts for the
overlapping variance.

7 We are unable to state the level of significance
of the shift in the absence (to us) of a known for-
mula to test the significance of the difference be-
tween these correlations.

Again, within the limitations imposed by
the size of the residual correlations, another
hypothesis that might be entertained is that, 
on the whole, sick and well groups both con- 
tain overlapping or like “kinds” of people, 
that both groups have within them the same 
spectrum of characterological types, the prin- 
cipal differences between them being that they 
are sick and well. The proposition might be 
interesting with respect to the general theo- 
retical notions of the continuity of health 
and sickness as opposed to the idea of treat- 
ing them as discontinuous. 

It is not within the scope of this paper or 
the data on which it is based to examine these 
two hypotheses in greater detail, much less to 
test them. They are presented to accent the 
shift in the pattern of correlations obtained 
and concomitant potential difference in inter- 
pretation of results. However, even if the data 
were capable of these two goals, the opera- 
tions of partial correlation are quite cumber- 
some and inefficient. The Q array, to be a 
useful and efficient technique for the quantifi- 
cation of clinical assessments, must be ca- 
pable of controlling and giving directly scores 
referable to variables such as HS and SD, as 
well as other variables which are under study 
at the time. All relevant variables, if they are 
to be manipulated successfully, must be con- 
trolled at the point of experimental design. 
This issue has been discussed by Stephenson 
(9) in the construction and use of Q arrays 
but not with reference to what we have de- 
scribed in this paper as “cultural stereotypes.” 
Fordyce (4) has explored one type of solu- 
tion with reference to social desirability and 
a limited number of other variables. 


Summary 


1. A statement of the desirable character- 
istics of an instrument for quantifying clini- 
cal assessments is given. A description of the 
construction of such an instrument is out- 
lined. However, rather than evaluating the 
instrument in terms of its original intention, 
this paper confines itself to two problems 
which emerged as the data were being col- 
lected: (a) The degree to which total vari- 
ance of group comparisons was dominated by 
what appeared to be a single variable. (b) 
The problem of sorting out the variance at- 
tributable to this variable from others being 
measured so they could be interpreted cor-
rectly.

2. The variable of Health-Sickness defined 
by the operations described in this paper 
seems indistinguishable at this point from 
the characteristic attribute of self-assessors 
described in several papers as social desir- 
ability. 

3. This variable in Q arrays should be con- 
trolled within the structure of the Q array. 
Failure to do so obscures the interrelationship 
of what we have described as “cultural stereo- 
types” and other variables in assessments 
within the particular problem studied. Sta- 
tistical separation of these variables is pos- 
sible but cumbersome and inefficient. 


Received April 30, 1956. 
A Construct Validation of the Edwards Personal
Preference Schedule with Respect to Dependency¹


Alfred C. Bernardin and Richard Jessor


Recent emphasis upon construct validity
(5, 8, 10) in psychological tests has had 
several beneficial effects. It has removed the 
“criterion” from a position of unquestioned 
validity in comparison with the “test”; it 
has tended to strengthen the relationship be- 
tween tests and theories; and finally, by re- 
quiring coordination of both test and criterion 
to the construct, it has emphasized careful 
specification of the components or properties 
of test constructs. The latter, especially, has 
immediate implications for experimental at- 
tempts to validate a test. 

The present investigation grew out of an 
interest in defining the construct of depend- 
ency in such a way as to be useful in relating 
behavior on a psychometric personality test 
to behavior in several experimental situations 
developed in light of the properties of the 
dependency construct. Such relationships, if 
found, would constitute one source of evi- 
dence for the construct validity of the test or 
an aspect of the test.

A review of the literature indicated consid-
erable agreement upon what is meant by de-
pendency. Three components were incorpo-
rated into the present definition on the basis
of this review and a consideration of available
personality tests. Dependency was defined as
including: (a) reliance on others for approval
or importance of approval from others, (b)
reliance on others for help or assistance, and
(c) conformity to opinions and demands of
others.

1 This article is a report of a Masters Thesis (2)
completed by the first author under the supervision
of the second author.

The instrument employed in this research
was the Edwards Personal Preference Sched-
ule (PPS) (6), a recently devised inventory
purporting to measure 15 personality needs 
originating from the work of H. A. Murray 
(9). The inventory has several desirable char- 
acteristics—it attempts to measure normal 
personality variables, it employs a forced- 
choice item form, and it has successfully 
minimized the role of social desirability in 
item choice (6, pp. 14-16). The PPS does 
not include dependency as one of the vari- 
ables measured, but two of the variables in- 
cluded in the inventory appeared related to 
our definition of dependency. These two vari- 
ables are deference and autonomy, and they 
are defined by Edwards (6, p. 5) as follows: 

deference: To get suggestions from others, to find 
out what others think, to follow instructions and do 
what is expected, to praise others, to tell others that 
they have done a good job, to accept the leadership 
of others, to read about great men, to conform to 
custom and avoid the unconventional, to let others 
make decisions.

autonomy: To be able to come and go as desired, 
to say what one thinks about things, to be independ- 
ent of others in making decisions, to feel free to do 
what one wants, to do things that are unconven- 
tional, to avoid situations where one is expected to 
conform, to do things without regard to what others 
may think, to criticize those in positions of au- 
thority, to avoid responsibilities and obligations. 


Although it would have been possible to 
utilize other PPS variables, e.g., succorance, 
it was decided to select Ss only in terms of 
these two. For purposes of the research, Ss 
were classified as dependent if they scored at
or above the 70th percentile on deference and 
at or below the 50th percentile on autonomy, 
with a minimum separation of 30 percentile 
points between the deference and autonomy 
scores for each S. The Ss were classified as 
independent if they scored at or above the 
70th percentile on autonomy and at or below
the 50th percentile on deference, with a mini-
mum separation of 30 percentile points be-
tween the deference and autonomy scores.²
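
The selection rule can be written out as a small sketch. The function name `classify` is hypothetical, and the consistency-score requirement mentioned in the footnote is left out for brevity.

```python
def classify(deference_pct, autonomy_pct):
    """Return 'dependent', 'independent', or None from PPS percentile scores."""
    # Both groups require at least 30 percentile points between the two scales.
    if abs(deference_pct - autonomy_pct) < 30:
        return None
    if deference_pct >= 70 and autonomy_pct <= 50:
        return "dependent"
    if autonomy_pct >= 70 and deference_pct <= 50:
        return "independent"
    return None


print(classify(80, 40))  # high deference, low autonomy
print(classify(30, 90))  # high autonomy, low deference
print(classify(75, 50))  # meets both cutoffs but separation is only 25
```

Note that the 30-point separation is the binding condition at the edges: a subject at the 70th and 50th percentiles satisfies both cutoffs yet is excluded.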

From an administration of the PPS to 520 
students at the University of Colorado, 55 
dependent and 55 independent Ss were se- 
lected. The mean separation between their 
scores on the two variables was approxi- 
mately 63 percentile points. From these 110 
Ss, persons were assigned at random to one 
of three experiments. 

In summary, then, this paper reports an
attempt to validate experimentally the con-
struct of dependency as a variable in per-
formance on the PPS.


Experiment 1 


Problem. The hypothesis of this study was 
that dependent persons under conditions of 
negative verbal reinforcement (critical com- 
ments) would perform less efficiently on a 
finger-maze learning task than either inde-
pendents under the same conditions or con-
trol groups of dependents and independents
receiving no negative reinforcement.


The reasoning 


persons, as defined, are more reliant on others
for approval or consider approval from others 


etc. 


2 All Ss in the study were required to have a con-
sistency score (6, p. 6) on the PPS of at least 10.
This is a score which enables evaluation of whether
an S is responding to the items consistently or on a
chance basis. The probability of a score of 10 or
better by chance alone is .15. The percentiles are
based on Edwards’ norms reported in (6).




Subjects and procedure. Forty persons, 20 
dependents and 20 independents, were uti- 
lized. Ten dependents and 10 independents 
were randomly selected for the experimental
condition (negative verbal reinforcement); 
and the remaining 10 dependents and 10 in- 
dependents were designated as control Ss. 


Each S was individually brought into the experi-
mental room where he was seated and informed that
he would be asked to learn a finger maze. The maze
was constructed of raised welding rod fastened to a
wooden base and consisted of 20 choice-points be-
tween start and end. S was blindfolded and allowed
to explore the maze from the starting point to the
first choice-point for a period not exceeding twenty
seconds. Instructions were read to him stating that
he would have fifteen minutes to learn the finger
maze and that the experimenter would keep track of
both the number of errors which he made and the
number of perfect runs through the maze. A stop
watch was started and S’s progress through the
maze was recorded by an assistant. When either the
maze had been learned to criterion of three consecu-
tive perfect runs or when fifteen minutes had elapsed,
the S was stopped. All 40 Ss were run through the
procedure in Experiment 1 but only the 20 Ss in the
experimental condition were given negative reinforce-
ments (critical verbal comments) during the experi-
mental period. At intervals E made such comments
as: “You’re going very slowly,” “Your performance
is not very good,” “I thought you could do much
better than this,” etc. In addition, E said “Wrong”
each time S made a blind alley entrance or retracing
error. The 20 control subjects received no negative
reinforcement while learning the maze.


In order to assess quality of performance
on the maze task, mean errors per run, mean
time per run, and percentage of savings were
calculated for each S. The S might make
errors in three ways: (a) a blind-alley en-
trance, (b) retracing the same path, and (c)
vacillation: either standing still in one place
on the maze or movements back and forth
on the paths around a choice-point (at any
given choice-point where vacillation had oc-
curred one error was recorded). Errors per
run and time per run were computed on the
basis of all runs subsequent to the first run.
First-run error scores could be due largely to
chance, since no Ss had prior experience with
this particular maze. In none of the three ex-
periments reported here did E have knowl-
edge of whether a particular S was a depend-
ent or independent.

Results. Means and standard deviations on
each of the three measures of quality of
performance for each of our four groups
(dependent experimental, independent experi-
mental, dependent control, and independent
control) are given in Table 1.


Table 1

Means and Standard Deviations for Dependent and Independent Experimental and Control Groups
on Three Measures of Quality of Performance

                            Dependent              Independent
Measure                Experimental  Control   Experimental  Control
Errors/run    Mean        2.598       1.826       1.540       1.600
              SD           .658        .645       1.100        .576
Time/run      Mean         .981        .709        .762        .739
              SD           .781        .523        .134        .660
% savings     Mean        67.56       84.53       80.47       86.80
              SD           4.36        5.49        3.33        5.23

Since the variance among groups on two of 
the three measures is heterogeneous, a non- 
parametric statistic, the Kruskal-Wallis H test 
(7), was employed to evaluate the data. An 
over-all H test was run for each of the three 
measures. Since the over-all H values were 
significant in each case, it was possible to use 
the H test to compare the four groups within 
each measure. The H values for the relevant 
cell comparisons are presented in Table 2. 
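
For readers unfamiliar with the statistic, the H test ranks all observations together and compares the rank sums of the groups. The sketch below is an illustration only, not the authors' computation: the function name and toy data are hypothetical, mid-ranks are used for tied values, and no correction for ties is applied.

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H for two or more samples (no correction for ties)."""
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    # Mid-rank of each distinct value: mean of the 1-based positions it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    # Sum over groups of (rank sum squared / group size).
    total = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Toy data (hypothetical): two completely separated groups of three scores.
h_toy = kruskal_wallis_h([1, 2, 3], [4, 5, 6])
print(round(h_toy, 3))
```

H is referred to the chi-square distribution with one fewer degrees of freedom than the number of groups, so a two-group comparison uses 1 degree of freedom.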

An examination of Table 2 indicates that 
the hypothesis for Experiment 1 is supported. 


Table 2

H Tests Among Relevant Cells on Each of the Three
Measures of Quality of Performance

                                                     As
Measure        Groups       H value    p level       predicted
Errors/run     DE vs IE      4.010     .05           Yes
               DE vs DC      4.802     .04           Yes
               IE vs IC       .113     not signif.   Yes
               DC vs IC       .568     not signif.   Yes
Time/run       DE vs IE      5.139     .03           Yes
               DE vs DC      6.219     .02           Yes
               IE vs IC      2.282     not signif.   Yes
               DC vs IC       .820     not signif.   Yes
% savings      DE vs IE      4.318     .05           Yes
               DE vs DC      7.402     .01           Yes
               IE vs IC      1.848     not signif.   Yes
               DC vs IC       .820     not signif.   Yes


Dependent Ss under conditions of negative 
verbal reinforcement made significantly more 
errors per run, took significantly longer per 
run, and showed significantly less percentage 
of savings than independents under the same 
conditions. Quality of performance for de- 
pendent experimentals was significantly low- 
ered compared with dependent controls. No 
difference in quality of performance as a func- 
tion of negative reinforcement appeared be- 
tween independent experimentals and inde- 
pendent controls. 


Experiment 2 


Problem. The hypothesis of this study was
that dependent persons confronted with a 
difficult problem-solving task will request help 
significantly more often than independent per- 
sons, when both groups are informed that as- 
sistance may be gotten upon request. This 
study was an attempt to elicit direct referents 
(requests for help) of one of the properties of 
the dependency construct (reliance on others 
for help) and contrasts with the indirect test 
of another property (reliance on others for 
approval) reported in the preceding study. 

Subjects and procedure. Twenty dependent 
Ss and 20 independent Ss constituted the two 
groups in this experiment. 

The task employed was a difficult Chinese block 
puzzle consisting of 11 pieces which when assembled 
formed a 23-inch cube. All Ss were told that they 
would be asked to solve a difficult block puzzle 
within a 15-minute period. They were further in- 
formed that, because of the unusual difficulty of the 
task, E would be willing to give them as much help 
as they felt they needed, and that whenever help was 
requested, E would place the next piece of the block
puzzle in place. All Ss worked for a 15-minute pe-
riod, during which time E, by means of a concealed
counting mechanism, kept track of two scores. These 
two scores are referred to as the suggestion score and 
the corroboration score. The suggestion score was the 
number of direct requests an S made that E put the 
next piece of the block puzzle in place. The sugges- 
tion score is, then, the record of the number of times 
an S directly requested help on the block-puzzle task. 
The corroboration score was the number of times an 
S made comments asking for reassurance, e.g., “Is 
this correct?” “I’ve got it now, haven’t I?” etc.
This score was considered an indirect measure of the 
tendency to rely on others for help. Only one S was 
able to complete the assembly of the puzzle within 
the allotted 15 minutes, and his data were discarded 
since it had been decided in advance to equate time 
spent on the puzzle. 


Results. Since the variances were heteroge- 
neous for both the suggestion-score and cor- 
roboration-score data, the results were evalu- 
ated by use of the H test. Table 3 indicates 
that on both scores the dependents, as pre-
dicted, were significantly higher than the in-
dependents.

Another way of analyzing these data was 
was determined for 


was 10.0; the chi- 
roboration score 
values, for 1 degr 
are significant at 
of confidence, 


Table 3 


Differences Between Dependents and Independents in 
Both Suggestion and Corroboration Scores 


          Suggestion Score              Corroboration Score

     Dependents   Independents      Dependents   Independents


95 
1.24 


3.50 
2.31 


Experiment 3 


Problem. The hypothesis investigated in 
this study dealt with a third component of 
the dependency construct. It was hypothe- 
sized that in a situation requiring perceptual 
judgments to be made before a group, de-
pendent Ss will conform more to the judg-
ments of the group than will independent Ss.

Subjects and procedure. Fifteen dependents
and 15 independents were run through a pro- 
cedure very similar to that employed by Asch 
(1). 


The Ss were asked to judge whether, of two lines
drawn on a card, the left line was longer, shorter, or
the same as the line on the right. Sixteen judgments
were obtained from each experimental S in a situa-
tion with seven to nine other persons present, the
other persons having announced their judgments
ahead of the experimental S on each of the cards.
Since the rest of the group had been instructed by
E to give objectively incorrect judgments for 13 of
the 16 cards, it was possible to evaluate the degree
to which an experimental S conformed to the judg-
ments of the group. Since the variable line differed
in length from the 2-inch standard by fractions of
an inch, plus and minus, it was also possible to
determine whether dependents differed from inde-
pendents in the line-length difference at which
conformity began to occur.


Results. It was apparent by inspection that
no differences on either measure of conformity
appeared between the dependent and inde-
pendent Ss, and it was therefore not possible
to reject the null hypothesis. It should be
noted, however, that Asch’s general results
(1) were supported since approximately 60%
of the Ss, both dependents and independents,
exhibited conformity behavior.


Discussion and Conclusions 


The aim of the present study was to in-
vestigate the degree to which a construct of
dependency could mediate the relationship be-
tween test behavior and behavior in several
experimental situations. In this sense, the re-
search reported constitutes a construct valida-
tion approach to certain aspects of the PPS.
The results of the research are construed as
contributing to the construct validity of the
autonomy and deference scales of the PPS.

Some of the problems of the present re-
search might well be noted. With respect to
Experiment 1 the hypothesis employed in-
volves reliance on the concept of interfering
responses as the basis for the lowered quality
of performance of dependents under negative
verbal reinforcement. The study, however,
provides no direct measure of these interfer-
ing responses and their explanatory role re-
mains completely hypothetical. It may, that
is, be possible to account for the obtained re-
sults within a different formulation than was
employed here.

Blindfolding the Ss may have, in addition 
to the negative comments, contributed to the 
lowering of performance level since it is pos- 
sible that dependent persons rely more heavily 
upon visual cues of social reactions than do 
independents. Elimination of visual cues may 
have been more disrupting for the dependents 
than for the independents. There is no way 
of analyzing out this effect which, of course, 
would operate in the direction of the hy-
pothesis.

It is possible to point to at least one major 
factor which may have been responsible for 
the negative results of Experiment 3. Unfor- 
tunately the situation was so constructed that 
conformity to the group (which gave 13 out
of 16 objectively incorrect judgments) was
attainable only at the expense of disagreeing
with a fairly objective reality situation. A
less structured situation where the correct re-
sponse is less apparent to the S might be suc-
cessful in differentiating dependents from in-
dependents. It is possible, for example, to
have the task consist of ambiguous colors,
e.g., blue-green, to be named. Another factor
which may have been involved was the status
relation of the S to the group. Possibly the
use of group members of higher status than
S, e.g., graduate students and instructors,
might have elicited differential conformity be-
havior from the dependents and independents.

With respect to the over-all aim of the re-
search (the establishment of a useful con-
struct in relation to psychometric test behav-
ior), it would be worth while to demonstrate
that the three properties of dependency are
correlated within persons. It was simply not
feasible to utilize the same S in several ex-
perimental situations. Further research in
which this is achieved would provide more
direct evidence of the utility of the depend-
ency construct as defined.


Summary 


The present study is essentially a construct 
validation of certain aspects of the Edwards 
PPS related to the construct of dependency. 
Three properties of dependency were speci- 
fied—reliance on others for approval, reli- 
ance on others for help, and conformity to the 
opinions and demands of others. Persons were 
selected as dependent who scored high on
deference and low on autonomy on the PPS. 
Independents were defined by high scores on 
autonomy and low scores on deference. Three 
experiments were conducted, each one to 
measure a different property of dependency, 
and a total of 110 Ss were involved. The re- 
sults supported hypotheses relating to the 
greater reliance of dependents on others for 
approval and for help. No differences were 
found between dependents and independents 
in group conformity. 

On the whole the research serves to con- 
tribute to the construct validity of the au- 
tonomy and deference scales of the PPS and 
indicates the possible utility of the PPS for 
research studies in personality. 


Received May 18, 1956. 


Performance of High School Students on the
Edwards Personal Preference Schedule 1


C. James Klett 


University of Washington 2 


The Edwards Personal Preference Schedule 
(EPPS) (3) purports to measure fifteen rela- 
tively independent normal personality vari- 
ables drawn from a list of manifest needs de- 
scribed by Murray (5). An important feature 
of the EPPS is the attempt to minimize the 
tendency of subjects to respond in the so- 
cially approved direction by pairing items 
pertaining to differing needs but having simi- 
lar social-desirability scale values and pre- 
senting them to the subject in a forced choice 
format. Edwards (3), in his normative work 
on a college population, found that the inter- 
correlations of the EPPS variables were, in 
general, quite low, and that the fifteen meas- 
ures demonstrated satisfactory split-half and 
test-retest reliability. 

Since the EPPS has potential application
to groups other than the college population
upon which it was standardized and is, in
fact, already in use in a variety of applied
settings such as mental hygiene clinics, hospitals,
and industry, further normative work
would seem required. This investigation accumulated
normative data applicable to a high
school population while at the same time
studying the operation of several other variables
on EPPS performance. On the basis of
previous research (1), a variable considered
to be of primary importance was that of
social-class membership.


1 This study was a portion of a doctoral dissertation
completed at the University of Washington,
1956. Thanks are extended to Dr. Allen L. Edwards,
who was of invaluable assistance during all stages of
this study. The public school officials of King County
were generous in their assistance, and special thanks
are due to Mr. Hofstetter, Leonard Johnson,
Paul McCurdy. Shirley Klett, Michael Goldstein,
William Crow, and Cliff Lundeborg assisted in judging
and calculations. Thanks are due to the Marchant
Calculating Co. of Seattle for the loan of a calculator.


2 Now at VA Hospital, Northampton, Massachusetts.


Method


Subjects. The EPPS was administered in
two King County high schools outside the city
of Seattle, Washington. High School A had
an enrollment of 568 students and was located
in an outlying town in the county of
some 6,600 population. High School B had an
enrollment of approximately 1,850 students
and was located in an expanding residential
suburban area of the city of Seattle.

Of the total number of students in average
daily attendance in the two schools, [illegible]
were tested, yielding a total of 1,638 complete
and scorable records.3 There were in
the normative sample, 188 boys and [illegible] girls
from High School A and 616 boys and 695
girls from High School B. The age range in
the group was from 14.5 to 20 years, with
the mean age of the boys 17.1 and that of the
girls 16.9. In grade placement, there were [illegible]
sophomores, 560 juniors, and 414 seniors, almost
equally divided between boys and girls
within each grade. IQ scores 4 were available
for 1,521 subjects, and these ranged from


3 During the major part of the IBM processing,
five senior boys in High School A were inadvertently
omitted. Most of the normative data, then, are based
on an N of 1,633.

4 California Test of Mental Maturity. Some of
the subjects had only Otis Gamma scores, but a correlation
of the Otis Gamma and the California Test
of Mental Maturity for 1,087 subjects was
.81 and was felt to be high enough to permit
establishment of conversion tables so that California
equivalents could be obtained for the remainder of
the subjects.


Table 1

Distribution of Socioeconomic Status (SES) of
Subjects in the Normative Sample

                                        Frequency
Socioeconomic group                 Boys   Girls   Total

Professional                          77      71     148
Proprietors, managers, officials     121     123     244
Clerical and kindred                 111     117     228
Skilled                              272     282     554
Semi-skilled                         145     178     323
Unskilled                             52      44      96
Unemployed, welfare, relief           26      19      45

Total                                804     834    1638



[illegible] to [illegible], with the mean for the boys at 107.2
and that for the girls at 106.8.

Each respondent supplied his age, grade,
and a description of his father’s occupation on a
questionnaire accompanying the testing. From
the description of father’s occupation, classification
according to socioeconomic status
was made utilizing the tables developed
by the U. S. Bureau of the Census (2).




Table 2

Means and Standard Deviations of the EPPS Variables
for the High School Normative Sample

                                  Mean              Standard deviation
Variable                     Boys      Girls        Boys          Girls

 1. Achievement             13.88*     11.13        4.01          4.06
 2. Deference               [illegible]
 3. Order                   10.74      10.68        [illegible]   4.14
 4. Exhibitionism           15.40*     14.93        3.51          3.38
 5. Autonomy                [illegible]
 6. Affiliation             [illegible]
 7. Intraception            13.13      15.87*       4.49          [illegible]
 8. Succorance              11.03      12.76*       [illegible]   [illegible]
 9. Dominance               13.96*     11.99        [illegible]   4.39
10. Abasement               [illegible]
11. Nurturance              [illegible]
12. Change                  [illegible]
13. Endurance               13.81*     11.96        5.11          5.10
14. Heterosexuality         17.31*     14.39        7.04          6.98
15. Aggression              13.88*     11.43        4.39          4.19
Consistency score           10.81      11.68*       2.06          1.81

N                           799        834

* This mean is significantly larger at the one per cent level
than the corresponding mean for the opposite sex.


This classification provides for six categories, 
ranging from professional to unskilled work- 
ers. For the purposes of this study, a seventh 
category was also utilized to include welfare 
cases. The assignment of SES was made by a 
panel of three judges. Reclassification of 100 
randomly selected cases by two independent 
judges yielded reliabilities of .93 and .92. 
The distribution of assigned SES appears in 
Table 1. 

Administration. Administration was carried 
out during regular class periods by the class- 
room teacher, the testing program extending 
over two days as some of the classes were too 
short to allow all to finish. Two class periods 
proved more than ample for completion of the 
test. Absentees on either of the two days accounted
for the majority of the unusable
records.


Results


The data were treated separately by sex 
and by high school. Comparison of the means 


Table 3

Comparison of Means of EPPS Variables in High
School (H.S.) and College Normative Groups

                                         Mean
                                 Boys               Girls
Variable                     H.S.    College    H.S.    College

 1. Achievement             13.88    15.66*    11.13    13.08*
 2. Deference               11.38    11.21     11.81    12.40*
 3. Order                   10.74    10.23     10.68    10.24
 4. Exhibitionism           15.40*   14.40     14.93*   14.28
 5. Autonomy                14.57    14.34     11.89    12.29
 6. Affiliation             15.28    15.00     17.94*   17.40
 7. Intraception            13.13    16.12*    15.87    17.32*
 8. Succorance              11.03    10.74     12.76    12.53
 9. Dominance               13.96    17.44*    11.99    14.18*
10. Abasement               14.35*   12.24     17.66*   15.11
11. Nurturance              14.12    14.04     17.35*   16.42
12. Change                  17.12*   15.51     18.09*   17.20
13. Endurance               13.81*   12.66     11.96    12.63*
14. Heterosexuality         17.31    17.65     14.39    14.34
15. Aggression              13.88*   12.79     11.43*   10.59
Consistency score           10.81    11.53*    11.68    11.74

N                           799      760       834      [illegible]

Note. Complete college norms may be found in Edwards (3).
* This mean is significantly larger at the one per cent level
than the corresponding mean for the other normative group.


variables, autonomy and dominance, were
found to be significantly related to socioeco- 
nomic status when IQ was controlled in an 
analysis of variance design. Significant corre- 
lations were obtained between socioeconomic 
status and other EPPS variables, but the co- 
efficients are low enough to justify the exclu- 
sion of this variable from practical considera- 
tion in the interpretation of the EPPS scores. 

The significant differences between various 
groups on EPPS scores lend considerable face 
validity to the needs as defined by Edwards. 


Received September 5, 1956. 
Early Publication. 


Simulation of “Normalcy” by Psychiatric Patients
on the MMPI 1


Harry M. Grayson 


Studies on simulation of normal or abnormal
adjustment on psychological tests have
been conducted by a number of techniques
including the Rorschach (1, 2, 4, 11), sentence
completion test (9), and objective personality
schedules (3, 6, 7, 8, 10). However,
studies on simulation of “normalcy” by psy- 
chiatric patients are, to our knowledge, non- 
existent. Such studies would appear to be 
justified by the following considerations: they 
may throw light on the concepts of normal 
adjustment which are held by different kinds 
of psychiatric patients; they may yield im- 
portant implications for psychotherapy, based 
upon the nature of the differences between 
the patient’s self concept and his ego ideal, 
as inferred from differences between his origi- 
nal and simulated test performances; they 
may prove of value in predicting length of 
hospitalization or outcome of therapy, since 
the ability to “improve” on the tests may re- 
flect a degree of reality orientation or ego- 
strength suggestive of a favorable prognosis. 

The present paper, which is part of a
study involving tests tapping different levels
of personality, presents preliminary findings
based on the Minnesota Multiphasic Person- 
ality Inventory (MMPI). In undertaking this 
investigation, answers were sought to the fol- 
lowing questions: 

1 The authors are indebted to W. L. Martinsen and
[illegible] Illustrations Laboratory for
the figures and slides which they prepared; and to
Saul Kupferman [illegible] for [illegible].

[illegible] psychology trainee at VA Neuropsychiatric
Hospital, Los Angeles.

1. To what extent can psychiatric patients
produce “normal” test performance when requested
to do so?

2. What other kinds of changes in test per- 
formance occur; and are these different for 
patients of different diagnostic categories? 

3. May ability to give improved perform- 
ance when simulating normalcy be predictive 
of shorter hospitalization? 


Procedure and Results 


The experimental procedure was as follows: 
Patients took the MMPI routinely, on enter- 
ing the hospital. The next day the test was 
repeated, but with instructions to answer it 
“the way a typical, well-adjusted person on 
the outside, would do.” Upon completing the 
test, each patient was asked to describe how 
he did it, what his method was; and his com- 
ments were noted. Forty-five consecutively 
hospitalized male patients participated in the 
study. In each case, the psychiatric admis- 
sions-diagnosis was obtained from the pa- 
tient’s clinical folder. 

In order to see how many patients actually 
improved their performance, the authors in- 
dependently made blind sortings of the pairs 
of original and simulated profiles for each 
patient, based on the expectation of improve- 
ment in the simulated profile. Where both 
investigators correctly sorted the patient’s 
profiles, and where this coincided with a re- 
duction in the Total T score based on the 
sum of the nine clinical scales, the case was 
considered as improved. On this basis, 33 out
of the 45 cases (73%) were judged as improved
in performance when simulating normalcy.
The rest either did not improve, or
became worse. Figure 1 shows the change in
the over-all pattern for these 33 improved
patients.


Fig. 1. Mean profiles on original and on simulated
performance, based on 33 “improved” cases.

The changes in the direction of the F and
K scales are of interest, the patients showing
increased defensiveness (K) and decreased confusion
(F). (If the F scale is considered simply
as a validity indicator, the simulated performance
would appear to be more valid than the
original!) It will be noted that the profile
changes from that of a highly disturbed incipient
paranoid schizophrenic to the double-spike
curve of the “anxiety-free psychopath.”
It is as though the typical, disturbed schizophrenic
patient were to say, “If only I could
lose my anxiety and guilt feelings, and my
feelings of personal inadequacy, and if I felt
less inhibited in accepting and acting on my
impulses, then I would not be the harassed,
mentally-ill person that I am.”

Statistical analysis of the differences in the
profiles of the nine clinical variables was undertaken,
based on a method recently devised
by Gengerelli and Butler (5). This method,
which takes into account the absolute magnitude
[...] degrees of freedom.*


* Two profile-numbers were computed for each individual
(one representing his original performance [...]

The kinds of diagnostic changes actually 
produced in the profiles of all 45 patients, im- 
proved and unimproved, are shown in Table 1. 
The table reveals that 28 of the 45 cases 
(62%) showed no change in diagnostic cate- 
gory, although many of these showed im- 
provement in terms of a “softening” or re- 
duction in the deviancy of the personality 
pattern. Only five cases (11%) became “nor- 
mal.” The remaining 27 per cent underwent 
a “diagnostic shift” to another category. 

Of the 24 schizophrenics, exactly half re- 
mained unchanged, while the rest converted 
to character disorders, psychoneurotics, psy- 
chosomatic, and “normal.” Of the 13 charac- 
ter disorders, 11 remained unchanged, while 
two became “normal.” Of the four manic-de- 
pressives, two remained unchanged; one be- 
came schizophrenic; and one a character dis- 
order. And of the four psychoneurotics, three 
remained unchanged although less deviant, 
while one gave a schizophrenic profile. 
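
The percentages quoted above can be re-derived from the cell counts of Table 1. The following is a minimal sketch of that arithmetic, not part of the original analysis; the dictionary layout and the names `shifts` and `summarize` are illustrative assumptions, and the counts are transcribed from the table and the accompanying text.

```python
# Cell counts transcribed from Table 1:
# (original diagnosis, simulated diagnosis) -> number of patients.
shifts = {
    ("schizophrenia", "schizophrenia"): 12,
    ("schizophrenia", "character disorder"): 6,
    ("schizophrenia", "psychoneurosis"): 2,
    ("schizophrenia", "psychosomatic syndrome"): 1,
    ("schizophrenia", "normal"): 3,
    ("character disorder", "character disorder"): 11,
    ("character disorder", "normal"): 2,
    ("manic depressive", "manic depressive"): 2,
    ("manic depressive", "schizophrenia"): 1,
    ("manic depressive", "character disorder"): 1,
    ("psychoneurosis", "psychoneurosis"): 3,
    ("psychoneurosis", "schizophrenia"): 1,
}


def summarize(table):
    """Split the cases into unchanged, became normal, and shifted category."""
    total = sum(table.values())
    unchanged = sum(n for (orig, sim), n in table.items() if orig == sim)
    normal = sum(n for (orig, sim), n in table.items() if sim == "normal")
    shifted = total - unchanged - normal
    return {"total": total, "unchanged": unchanged,
            "normal": normal, "shifted": shifted}


summary = summarize(shifts)
# Whole-number percentages of the 45 cases, as reported in the text.
percentages = {key: round(100 * value / summary["total"])
               for key, value in summary.items() if key != "total"}
print(summary)
print(percentages)
```

The three groups are mutually exclusive here because no patient whose original diagnosis was "normal" appears in the sample.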

Thus, although in general the simulated 
profile was better than the original, most pa- 
tients, under the conditions of the experiment, 
did not produce a “normal” performance, but 
rather changed the degree of severity or the 
nature of the behavior disorder. Figures 2 and 
3 show examples of two kinds of changes 
which occurred. Space considerations preclude 
the showing of other interesting examples. 

Figure 2 illustrates a “softened” deviancy
pattern, and indicates a diagnostic shift from
an incipient schizophrenic reaction to one
of anxiety neurosis. Figure 3 illustrates a
“mirror image” or reaction-formation pattern
which sometimes results when the patient
strives hard to deny unacceptable feelings or
impulses. This figure exemplifies a diagnostic
shift from a decompensating obsessive-compulsive
neurosis to an acting-out type of
character disorder.


Table 1

“Diagnostic Shifts” Between Original and
Simulated MMPI Profiles

Original diagnosis          Simulated diagnosis         Number

Schizophrenia (N = 24)      Schizophrenia                 12*
                            Character disorder             6
                            Psychoneurosis                 2
                            Psychosomatic syndrome         1
                            Normal                         3
Character disorder          Character disorder            11*
  (N = 13)                  Normal                         2
Manic depressive            Manic depressive               2*
  (N = 4)                   Schizophrenia                  1
                            Character disorder             1
Psychoneurosis              Psychoneurosis                 3*
  (N = 4)                   Schizophrenia                  1

* Diagnosis remained unchanged.


Fig. 2. An example of a “softened” deviancy pattern.

The patients’ comments were frequently of interest,
as evidenced by the following examples: “It was
pretty hard at first but then you just think: I'm
Superman, there’s nothing wrong with me.”; “Well,
I was thinking of my dad, for example. He’s always
been my ideal.”; “Well, I just put down the opposite
to what I did yesterday.”; “Through books and
things you get to understand the average person.”;
“I have answered these questions as a so-called normal
person would (if there is such a person). What
is a normal person?”; “I answered these questions
as I hope to answer truthfully in the near future.”


The degree and nature of the changes have 
possible diagnostic and therapeutic implica- 
tions. For example, in some cases, especially 
in the character-disorder class, the similarity
of the two profiles, along with the verbaliza- 
tions, indicates that the patients do not feel 
there is anything wrong with them. As one 
patient put it, “I am well adjusted. I have 
no nervous problems.” Other patients, by the 
very mild softening of their deviant profiles, 
express that they are essentially normal. In 
the words of one patient, “What would you 
say if I told you I wouldn’t have to change 
mine a whole lot to be typically well ad- 
justed?” Others feel quite hopeless. For ex- 
ample, one patient said, “How can we an- 
swer as a typical, well-adjusted person if 
we're not?” The cases of “diagnostic shift” 
suggest that, for some patients, being normal 
involves the use of different (usually less 
deviant) adjustment patterns, including the 
elimination of bizarre symptoms. 

There was a significant reduction in Total 
T score for all 45 patients on the nine clinical 
scales, the critical ratio being 4.6. This means 
that most psychiatric patients were capable 
of recognizing and avoiding many of the in- 
dividual deviant responses, even though they 
were still largely unable to produce normal 
profile patterns. For the 33 “improved” cases, 
the critical ratio rose, as one would expect, 
to 6.3. There was also a significant reduction 
in the number of scales at or above the criti- 
cal T score of 70. 

As Table 2 shows, significant improvement 
took place on all scales except the Lie scale 
and the Mf and Ma scales. 


Table 2

Changes in T Score on the MMPI Scales

            Mean                  Critical
Scale       change      SEx       ratio

L           - 0.4       0.47      0.9*
F           - 5.6       1.50      3.7
K           + 3.5       0.74      4.7
K-F         + 9.1       2.10      4.3
Hs          -13.1       3.30      3.9
D           -22.6       4.00      5.6
Hy          -11.0       2.40      4.6
Pd          -14.0       2.36      5.9
Mf          - 5.9      15.10      0.4*
Pa          -14.8       2.90      5.1
Pt          -16.9       3.06      5.4
Sc          -20.9       3.50      6.0
Ma          - 1.0       1.80      0.6*


* Not significant.
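
The critical ratios in Table 2 can be approximately reproduced by dividing each mean change by its standard error. The sketch below assumes that conventional definition, which the text does not state explicitly; the function name `critical_ratio` is illustrative, and only a few rows, transcribed from the table, are included.

```python
# (mean change in T score, standard error, reported critical ratio),
# transcribed from a few rows of Table 2.
rows = {
    "F": (-5.6, 1.50, 3.7),
    "K": (3.5, 0.74, 4.7),
    "Hy": (-11.0, 2.40, 4.6),
    "Pd": (-14.0, 2.36, 5.9),
    "Sc": (-20.9, 3.50, 6.0),
}


def critical_ratio(mean_change, standard_error):
    """Assumed definition: absolute mean change over its standard error."""
    return abs(mean_change) / standard_error


computed = {scale: critical_ratio(change, se)
            for scale, (change, se, _) in rows.items()}

for scale, (_, _, reported) in rows.items():
    print(f"{scale:>3}: computed {computed[scale]:.2f}, reported {reported}")
```

Small differences in the second decimal place are expected, since the tabled means and standard errors are themselves rounded.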

Apparently, the “improved” patients did
not feel it necessary to present more of a so- 
cially acceptable front on the obvious items 
of the Lie scale, although defensiveness in- 
creased significantly on the more subtle K 
and K-F scales. Furthermore, the patients did 
not seem to consider the Mf and Ma scale 
items as indicative of pathology. All of the 
other scales, however, underwent significant 
change, the greatest mean differences in T 
score taking place on the D, Sc, and Pt scales. 

It seemed reasonable that the degree of
test improvement might be a useful index of
potentiality for clinical improvement. It was,
therefore, hypothesized that the greater the
improvement shown when the original and
simulated profiles were compared, the less
would be the need for prolonged hospitalization.
To test this, the number and corresponding
percentage of correct identifications (trial
visit or discharge, on the one hand, vs. continued
hospitalization, on the other) were obtained
three months [...] degree of improvement
when simulating “normalcy” and status
[...] hospitalized) three months later.

At each criterion level, the discharged patients
who equalled or exceeded the criterion
measure, and the hospitalized patients who
did not equal the criterion measure, were
considered correctly identified by the application
of that criterion.

It will be seen from the table that the
higher the criterion value, the greater the accuracy
of prediction for patients who remained in
the hospital. (That is, these patients were
less capable of producing that much change.)
On the other hand, the lower the criterion
value, the greater the accuracy of prediction
for patients who were out on trial visit or
discharge status. (That is, more of these patients
could meet this less stringent criterion.)


Table 3

Correct Identifications of Discharged vs. Hospitalized
Status at Three Reduction Levels of MMPI Change

                                            Correctly identified
                                          Discharged   Hospitalized
Reduction in total T score                   pts.          pts.

90 or more (= 1 sigma per scale)             47%           81%
65 or more (= 0.7 sigma per scale)           68%           65%
45 or more (= 0.5 sigma per scale)           74%           58%


Since a criterion change of 65 or more 
gave the least number of false negative identi- 
fications for either group of patients, a 2 by 
2 chi-square test was performed. This yielded 
a chi-square value of 5.0 which proved sig- 
nificant, for 1 degree of freedom, at the .025 
level of confidence, indicating a significant 
relationship between ability to improve and 
early trial visit or discharge. 
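
The 2 by 2 test can be sketched as follows. The cell counts are an assumption: the paper reports only percentages, and group sizes of 19 discharged and 26 hospitalized patients are the split of the 45 cases that reproduces every percentage in Table 3. The function name `chi_square_2x2` is illustrative, and no continuity correction is applied.

```python
import math

# Assumed counts (not stated in the source), inferred from Table 3
# at the criterion of a 65-point reduction in total T score.
discharged_met, discharged_not_met = 13, 6       # 13 of 19 = 68% correct
hospitalized_met, hospitalized_not_met = 9, 17   # 17 of 26 = 65% correct


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], uncorrected."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


chi2 = chi_square_2x2(discharged_met, discharged_not_met,
                      hospitalized_met, hospitalized_not_met)

# With 1 degree of freedom, the upper-tail probability of chi-square
# equals the two-sided normal tail beyond sqrt(chi2).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

Under these assumed counts the statistic comes out close to the reported value of 5.0, which suggests the published test was computed without a continuity correction.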

It is of interest to note that neither the 
Total T score on the original performance, 
nor on the simulated performance, taken 
separately, showed any relationship to length 
of hospitalization. In other words, neither the 
initial degree of disturbance, nor the simu- 
lated degree of disturbance, is predictive; but 
only the change in degree of disturbance.

Apart from predictive value, the double- 
testing approach herein employed seems to 
offer another advantage, namely that of high- 
lighting the subject’s problem areas. Such 
problem areas are not always immediately 
apparent either to the therapist or to the pa- 
tient. The kind of clear and immediate focus 
which this approach seems capable of yielding 
could be of real use to the diagnostician or 
therapist. For example, the patient whose al- 
tered profile declares, in effect: “I would be 
normal if I could accept my passive tendencies 
and feminine interests more comfortably,” or 
“If I had fewer doubts about my manhood, I
could be less anxious,” may be helped early 
in the therapeutic process to view these feel- 
ings and attitudes in a perspective which 
minimizes any seriously disruptive effects they 
might have on treatment. The psychotherapist 
is thus enabled to correct the patient’s gross 
misconceptions early and, by helping the patient
to make a more realistic appraisal of his
strengths as well as his limitations, to clear
the way for further psychotherapeutic endeavors.

Perhaps a word of caution is in order lest
the data lead to unduly optimistic interpretations
of the changes observed in the “improved”
cases. It might be well to distinguish
between favorable prognosis for early dis- 
charge from the hospital as against psycho- 
therapeutic accessibility. Many patients ca- 
pable of simulating normalcy on a lip-service 
basis might also be capable of producing su- 
perficial changes in their outward behavior 
which would enable them to be rated much 
improved and ready for release. Some of these 
patients may merely have gained temporary 
control of their erratic and frequently unman- 
ageable impulses without any new basic un- 
derstanding or mastery of these impulses. 
Others, who may recognize how much they 
are at variance with the rest of the world, 
may be unable to effect the basic changes 
which psychotherapy hopes to achieve. 


Summary 


In summary, this study revealed marked 
individual differences in the ability of psychi- 
atric patients to simulate “normalcy” on the 
MMPI. Although most of the patients (73%) 
gave an improved performance, very few 
(11%) became “normal” and some became 
worse. Ability to improve differed for pa- 
tients in different diagnostic categories. Im- 
provement was manifested, in many cases, by 
a reduction in the deviancy of the same diag- 
nostic pattern; in other cases, by a “diag- 
nostic shift” to a less seriously disturbed 
category. Areas of emotional disturbance ap- 
peared to be highlighted in terms of differ- 
ences between the patient’s self concept and 
his ego ideal, as these could be inferred from 
the changes that took place in the profiles. 
Improvability on the test appears to be a 
favorable prognostic indication for early hospital
discharge. Some diagnostic and therapeutic
implications of the double-testing approach
used in this study were briefly
discussed.


Received June 5, 1956. 


Factors Influencing the Prediction of Behavior
from a Diagnostic Interview 1


Helene Borke* and Donald W. Fiske 


The University of Chicago 


This study explored the effects of different 
conditions upon the accuracy of prediction 
from a diagnostic interview. It differs from 
previous studies in two ways: it utilized di- 
rect interaction as one condition and it used 
responses to a nonverbal preference instru- 
ment as one of the behaviors to be predicted. 


Procedure 


Four judges studied four subjects (Ss) in 
a diagnostic interview situation. The Ss were 
male veterans of World War II, between the 
ages of 25 and 35, who were diagnosed as 
anxiety neurotics by two physicians, one of 
whom was a psychiatrist. The judges were
third- and fourth-year male VA trainees. 
Each judge came in contact with each of the 
Ss under one of the following conditions: di- 
rect interaction, observation of the interac- 
tion from behind a one-way screen, listening 
to a recording of the interview, and reading
a verbatim transcript. The behavior to be
predicted consisted of two Q sorts (8) with
fixed distributions: the S’s conscious atti- 
tudes about himself were reflected in a ver- 
bal Q sort consisting of 100 items based on 
Murray’s needs (6); his picture preferences 
were elicited by a Q sort of one hundred 


1 This study was conducted at the Hines Veterans 
Administration Hospital. The authors gratefully ac- 
knowledge the assistance of Mr. Frank Brogno, Mr. 
Edward Katz, Mr. Arthur Oriel, and Dr. John Mac- 
Gahan, who served as judges.


picture-postcard reproductions of famous 
paintings selected according to six Rorschach 
determinants—form, color, mood, human 
movement, sex, and aggression. The verbal 
statements and a complete account of the 
selection of the pictures are given elsewhere 
(1). The Ss were told that both the Q sorts 
and the interview which followed were part 
of the regular hospital diagnostic routine. 
The judges made both sorts for themselves 
and also for a “typical” anxiety neurotic be- 
fore making predictions for the experimental 
Ss. Immediately after his contact with an 
S, each judge was asked to predict how he 
thought the particular S had sorted the ver- 
bal and picture materials. Following a latin- 
square design, each judge predicted for each 
of the Ss under a different condition. Each 
prediction sort was then correlated with the 
corresponding patient’s self-sort. It was also 
correlated with the judge’s sort for himself 
and for a “typical” anxiety neurotic.3 The
resulting correlation coefficients were ana- 
lyzed by W, the coefficient of concordance, a 
measure of association among sets of ranks * 
(5). 
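
The latin-square assignment described above can be illustrated with a short sketch. The actual assignment of judges, Ss, and conditions is not reported, so the cyclic construction below is only one possible arrangement; the names `cyclic_latin_square` and `is_latin_square` and the short condition labels are illustrative assumptions.

```python
# Short labels for the four contact conditions named in the text.
conditions = ["interaction", "observation", "recording", "transcript"]


def cyclic_latin_square(items):
    """Build an n x n latin square by cyclically shifting the item list."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def is_latin_square(square):
    """Check that every item occurs exactly once in each row and column."""
    n = len(square)
    expected = set(square[0])
    rows_ok = all(len(row) == n and set(row) == expected for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == expected
                  for c in range(n))
    return len(expected) == n and rows_ok and cols_ok


# Rows stand for the four judges, columns for the four Ss.
design = cyclic_latin_square(conditions)

for judge, row in enumerate(design, start=1):
    print(f"Judge {judge}: {row}")
```

In such a design each judge meets each S once, uses each condition once, and each S is seen once under each condition, which is what allows judge, subject, and condition effects to be examined with only sixteen contacts.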


Results


Effects of Differences in Environmental Conditions
on Predictions


The type of contact the judges had with an
S (interviewing, observing from behind a one-way
screen, listening to a recording, or reading
a verbatim transcript) had no significant
influence on their ability to predict the self-descriptions
or the picture preferences as
measured by the Q-sort technique.


3 The measurement of accuracy of prediction by a
correlation coefficient between two sorts with fixed
distributions ignores some components contained in
a total accuracy score, as analyzed by Cronbach
(2). It is obvious that the findings of our study are
restricted to the accuracy measure used.


Effects of Judges on Prediction 


The judges’ preconceived ideas about how 
a “typical” anxiety neurotic might be ex- 
pected to behave significantly influenced their 
predictions for particular Ss on both the ver- 
bal and picture sorts. The median correlation 
between the judges’ predictions for a “typi- 
cal” anxiety neurotic and their predictions for 
the individual Ss was .53 on the verbal sorts 
and .50 on the picture sorts. At the same 
time the findings indicate that for the verbal 
material, actual contact with an S did result 
in improved predictions over the stereotype. 
In 14 of the 16 pairs of correlations, the 
judges’ predictions for particular Ss corre- 
lated higher with each S’s self-sort than did 
their predictions for the stereotype. Since the 
differential predictability of the Ss may be 
associated with these gains, they were evalu- 
ated by determining the probabilities for 
each S separately, and then determining the 
probabilities of the obtained set of four p 
values, via the chi-square transformation (4). 
The resulting p was less than .05. Since the 
judges did not differ in accuracy, it was as- 
sumed in this analysis that for each judge, 
the four differences between stereotype and 
specific prediction were independent.
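
One common way of combining independent probabilities through a chi-square transformation is Fisher's method, in which minus twice the sum of the natural logarithms of k p values is referred to chi-square with 2k degrees of freedom. Whether this is the exact procedure of reference (4) is an assumption, and the four p values below are hypothetical placeholders, not the values obtained in the study.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p values with Fisher's chi-square transformation.

    Returns the chi-square statistic (2k degrees of freedom) and its
    upper-tail probability, using the closed form valid for even df.
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    tail = math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(k))
    return statistic, tail


# Hypothetical per-subject p values, for illustration only.
hypothetical_p = [0.20, 0.10, 0.30, 0.15]
statistic, combined = fisher_combined_p(hypothetical_p)

print(f"chi-square = {statistic:.2f} on {2 * len(hypothetical_p)} df, "
      f"combined p = {combined:.3f}")
```

The example shows how a set of individually unremarkable p values can yield a smaller combined probability when all of them point in the same direction.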

On the picture material, the predictions 
for the actual Ss were not significantly better 
than the predictions for the stereotype. 

The data suggest that projection did not 
play a large part in the predictions of the 
judges. For the most part the judges ap- 
peared to be fairly accurate in evaluating the 
extent to which the Ss’ attitudes agreed or 
disagreed with their own, i.e., judges who had 
high correlations between their predictions 
for the S and their sorts for themselves were 
the ones whose own sorts were actually quite 
similar to those of the S, whereas judges 
whose own sorts were very different from 
those of the S had negative correlations be- 
tween their prediction for the S and their 
own sorts. It was also found that similarities 
or differences between a particular judge and 
a particular S, as reflected in the correlation 


between their self-sorts, had no measurable 
effect on the judge’s accuracy in predicting 
either type of material. 


Effects of Behavior Being Predicted 


There were no significant differences in the 
accuracy of judges’ predictions for verbal and 
for picture material. When interpreting this 
finding, however, it should be kept in mind 
that the verbal predictions represented a 
significant improvement over the stereotype 
whereas the picture predictions did not. The 
higher agreement between the predictions for 
the stereotype and the Ss’ own sorts on the 
picture material may have resulted in part 
from the greater homogeneity of the Ss’ pic- 
ture sorts as compared with their verbal sorts. 
The intercorrelations between Ss on the pic- 
ture sorts ranged from .14 to .39, and on the 
verbal sorts from —.15 to .19. Since the 
stereotype represents a prediction for an av- 
erage, the greater the heterogeneity of the 
group, the poorer such an average becomes as 
a prediction for an individual S. Because of 
this difference in group heterogeneity on the 
two types of material, it is likely that a judge 
would have to improve considerably more 
over his prediction for the stereotype on the 
verbal material than on the pictures in order 
to achieve an accuracy score of equal size on 
both sorts. 


Effects of Subjects on Prediction 


There were significant differences (p = 
.01) between Ss in the accuracy with which 
the judges were able to predict their per- 
formances on both the verbal and picture 
material. One S was consistently predicted 
most poorly on both types of material. 


Discussion 


Perhaps the most unexpected finding of 
this study was that it made no difference in 
the judges’ predictions of attitudes about the 
self or picture preferences whether they in- 
terviewed an S directly, observed him, lis- 
tened to a recording of the interview, or read 
a verbatim transcript. However, these results 
are in close agreement with those of Segel 
(7) and Giedt (3). In an area where dis- 
agreements between experimental findings are 
common (9), such agreement is encouraging.
Segel and Giedt found that judges predicted
almost equally well whether they saw a com- 
plete sound film of an interview, listened to 
it, or read a verbatim transcript. Giedt also 
reported that judges who saw a silent mo- 
tion picture predicted significantly more 
poorly. The failure of auditory and visual 
cues to result in improved predictions, and 
the significantly less accurate predictions 
when visual cues alone were used (3), 
strongly suggest that judges rely primarily on 
the information provided by verbal content 
when predicting conscious attitudes about the 
self. 

An important reason for the almost exclu- 
sive use of content cues by the judges is the 
way the predictive process was defined. In 
most of these studies, the judges were asked 
to predict how the Ss might perform on some 
verbal task, i.e., sentence completion or Q 
sort, where one would expect the content to 
provide the most cues. It is possible that if 
understanding were defined differently as, for 
example, the ability to predict the feelings an 
S might be experiencing at a particular mo- 
ment, other cues might be utilized to a far 
greater extent. 


Summary 


The primary purpose of the investigation
was to discover which cues contribute most
to one person’s ability to understand another
in a diagnostic interview. Following a
Latin-square design, four clinical psychologists
studied each of four male anxiety neurotics
under one of the following conditions: direct
interview, seeing and hearing the interview
through a one-way screen, listening to a
recording of the interview, and reading a
verbatim transcript. The psychologists were then
asked to predict how each of the Ss had made
a verbal Q sort consisting of self-descriptive
items and a preference sort with pictures. The
findings were:

1. There were no significant differences in
accuracy of prediction under the various
experimental conditions of direct interaction,
observation, listening to a recording, and
reading. The finding suggested that the
clinicians in this study relied primarily on
content cues when making their predictions.

2. The judges were found to be about 
equally skillful in their ability to predict be- 
havior. 

3. Although the judges relied heavily on 
their stereotypes of a “typical” anxiety neu- 
rotic in making their predictions, the patients’ 
verbal sorts were predicted more accurately 
by the judges’ specific predictions for indi- 
vidual patients than by their stereotypes. No 
such increase in accuracy was found for the 
picture sorts. 

Two factors appear to be involved in the 
relative accuracy of a judge’s specific predic- 
tions as compared to his sort for a typical 
patient. One is the judge’s familiarity with 
the predictive task. The other is the simi- 
larity between the behavior to be predicted 
and the behavior on which predictions are 
based. 


Received April 27, 1956.
Some Comments on the Measurement of
Projection and Empathy 


Bernard I. Murstein 1


The University of Texas M. D. Anderson Hospital and Tumor Institute 


The concepts of projection and empathy 
have spurred considerable research in recent 
years with conflicting and often confusing re- 
sults. Much of the difficulty stems from the 
methodological approach to the measurement 
of these variables. In a recent article (2), 
Norman and Leiding have examined the re- 
lationship between the variables projection, 
empathy, and refined empathy. These vari- 
ables were examined with regard to both in- 
dividual and mass empathy tests. The Dy- 
mond individual empathy test measures the 
ability of a person to predict the self-rating 
of another person. The mass empathy test re- 
fers to a given individual’s ability to predict 
the self-responses of a group of persons. The 
authors concluded that there was no signifi- 
cant relationship between mass and individual 
empathy tests, although within each test, 
significant correlations existed between the 
aforementioned variables. 

It is the purpose of this paper to examine 
the various significant correlations found by 
Norman and Leiding, and demonstrate how 
these correlations are, at least in part, spurious,
and thereby weaken one’s confidence in
the findings reported by these authors. 

In demonstrating this spuriousness, it is 
convenient to describe the steps utilized by 
Norman and Leiding in measuring each variable.
Hence, an alphabetical shorthand (a, b, c, d, ...)
will be used in describing each
new step in order to make the comparison of 
the various steps less cumbersome. 


1. Dymond Individual Empathy Test 


1a. Raw Empathy

(a) A rates B as he thinks B would rate
himself. (b) A rates himself (A) as he thinks
B would rate him. (c) B rates himself (B).
(d) B rates A as he (B) sees him.

A measure of A’s empathic ability is obtained
by calculating the discrepancy of A’s
predictions for B (steps a and b) from B’s
actual ratings (steps c and d). Thus Raw
Empathy is derived from the equation

Raw Empathy = $(a - c) + (b - d)$

where the total score is the sum of the
discrepancies without respect to sign.

1 Now at Louisiana State University.


1b. Projection 


(a) A rates B as he thinks B would rate 
himself. (b) A rates himself (A) as he thinks 
B would rate him. (e) A rates B as he (A) 
sees B. (f) A rates himself (A).

Projection = $(a - e) + (f - b)$


1c. Refined Empathy


Refined Empathy = Raw Empathy - Projection =

$[(a - c) + (b - d)] - [(a - e) + (f - b)]$


2. Norman and Leiding’s Mass Empathy Test


2a. Raw Empathy


(g) In the total group tested, 51 per cent
or more of a group answered one of the
questions of a personality inventory in a certain
way (e.g., “Yes”).

(h) On an alternate form, A answered in
the same direction when he was required to
judge about most other people. A Raw Empathy
point is counted each time g and h occur
simultaneously.

Raw Empathy = $\Sigma(g, h)$


2b. Projection

(h) A says most other people will answer
in a certain way.

(i) A answers in the same way in judging
himself (A) as he predicted that most other
people would answer a given question.
Projection occurs each time h and i occur
together.

Projection = $\Sigma(h, i)$


2c. Refined Empathy

Refined Empathy = Raw Empathy - Projection =

$\Sigma(g, h) - \Sigma(h, i)$


The operations involved in correlating Raw
and Refined Empathy (r = .81) for Dymond’s
test are


Raw Empathy
$(a - c) + (b - d)$
vs.
Refined Empathy
(Raw Empathy - Projection)
$[(a - c) + (b - d)] - [(a - e) + (f - b)]$


It may readily be seen from the above no- 


tion: 


Refined Empathy
$[(a - c) + (b - d)] - [(a - e) + (f - b)]$
vs.
Projection
$[(a - e) + (f - b)]$


In view of the position of the common
components above, it may be seen that as the
projection score increases, the common component
in the refined empathy score, which
must be subtracted, becomes increasingly
larger. Hence, a negative relationship would
be expected between Refined Empathy and
Projection even if no actual psychological
relationship existed.
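
The argument can be illustrated with a small simulation. The following is a minimal sketch, not the procedure of any of the authors discussed: it assumes, purely for illustration, that Raw Empathy and Projection are independent, normally distributed scores with equal variance (the means, SDs, and number of judges are hypothetical), and it ignores the further overlap from the shared operations a and b. Refined Empathy is then formed as Raw Empathy minus Projection, as in step 1c.

```python
import math
import random


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_judges=2000, seed=0):
    """Draw unrelated Raw Empathy and Projection scores (hypothetical
    distributions) and correlate each with their difference."""
    rng = random.Random(seed)
    raw = [rng.gauss(20.0, 4.0) for _ in range(n_judges)]
    projection = [rng.gauss(15.0, 4.0) for _ in range(n_judges)]
    # Refined Empathy = Raw Empathy - Projection (step 1c)
    refined = [r - p for r, p in zip(raw, projection)]
    return {
        "raw_vs_projection": pearson_r(raw, projection),
        "raw_vs_refined": pearson_r(raw, refined),
        "refined_vs_projection": pearson_r(refined, projection),
    }


if __name__ == "__main__":
    for name, r in simulate().items():
        print(f"{name}: {r:+.2f}")
```

Under these assumptions the correlation between Raw Empathy and Projection stays near zero, yet Raw and Refined Empathy correlate positively and Refined Empathy and Projection correlate negatively. With equal variances the expected values are about +.71 and -.71, produced entirely by the shared terms.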

The operations which result in an r of .86 
between the Raw Empathy and Projection 
variables within the Mass Empathy Test of 
Norman and Leiding are as follows: 


Raw Empathy
$\Sigma(g, h)$
vs.
Projection
$\Sigma(h, i)$


Since operation h was common to both variables,
the resulting correlation contained a
degree of positive spuriousness.

The last significant correlation is a negative
one (r = -.69) between Refined Empathy
and Projection. The operations are


Refined Empathy
(Raw Empathy - Projection)
$\Sigma(g, h) - \Sigma(h, i)$
vs.
Projection
$\Sigma(h, i)$


Again, one may note that as the Projection 
score increases, the Refined Empathy score 
will drop solely on the basis of the method of 
measurement. Thus, a spurious negative relationship
is to be expected.

The study of such concepts as projection 
and empathy should be of utility in under- 
standing human behavior and in predicting 
behavior. Unfortunately, complicated meth- 
odologies involving the manipulation of sun- 
dry discrepancy scores are not only difficult 
to justify on logical grounds, but often con- 
tain spurious components which make the 
end scores invalid. It is possible, of course, to 
partial out these spurious elements as Calvin 
and Holtzman (1) have done. It seems more 
parsimonious, however, to construct a meas- 
uring instrument which operationally defines 
the concept to be examined without being 
statistically inappropriate.
Self Concept and Defensive Behavior
in the Maladjusted 


Joseph S. Hillson 
Norfolk State Hospital 


and Philip Worchel 


University of Texas 


Considerable light has been thrown on the 
nature of the personality dynamics in the 
maladjusted by the many recent investiga- 
tions of the self concept. In order to evaluate 
the implications of the results of these stud- 
ies on self theory, it is necessary to distin- 
guish two phases in the development of mal- 
adjustment: the arousal of anxiety and the 
development of defensive behavior. Self the- 
ory (11) contends that tensions arise when 
the organism strives to satisfy needs not 
consciously admitted and to respond to experiences
denied by the conscious self. Anxiety
is felt when the individual is aware of
this tension or discrepancy. Sullivan (14), in 
similar fashion, states that anxiety appears 
when anything spectacular happens that is 
not welcome to the self. Defensive behavior 
develops in order to maintain the structure 
of the self, and as Rogers (11) suggests, the 
more perceptions of experiences inconsistent 
with the concept of self there are, the more 
rigid is the organization of the self-structure. 
When the self cannot defend itself any longer 
against deep threats, a psychological break- 
down or disintegration occurs. Hogan (10) 
describes eight steps in the pattern of threat 
and defense. Anxiety is reduced by denial or 
distortion of perceived experience. 

In general, the investigations thus far have 
dealt with the nature of the self-structure in 
subjects who are anxious and aware of their 
maladjustment. Maladjustment in these stud- 
ies has been defined by extreme scores on 
some measure of maladjustment (2, 3, 5, 6, 9)
or by voluntary requests for assistance in the
solution of personal problems (4, 8, 13). If,
as suggested by Hogan (10), defensive be- 
havior reduces the awareness of threat, then 
maladjusted subjects who have developed de- 
fense patterns should no longer admit incon- 
gruity between perceived experience and their 
self concepts. Thus it is hypothesized from 
self theory, that 


(a) schizophrenic subjects characterized by 
defensive patterns deny any incongruity be- 
tween the self and ideal concepts, and 

(b) neurotic subjects with anxiety reac- 
tions report an inconsistency between their 
self and ideal concepts. 


Further hypotheses concerning the nature 
of the self concept in the maladjusted are de- 
rived from Adlerian theory on the dynamics 
of the neuroses. The neurotic, according to 
Adler (1), sets up fictitiously high goals because
of intense feelings of inferiority and
abnormal need for power. These goals, being 
unrealistic, are unobtainable, and failure to 
achieve them results in increased feelings of 
anxiety and inferiority. Therefore, it is pre- 
dicted that 


(c) neurotic subjects depreciate themselves, 

(d) schizophrenics, who have developed de- 
fense patterns, rate themselves at least as 
well as normal subjects, and 

(e) the ideal concept is higher for the neu- 
rotic than for the normal or adjusted person. 


It is proposed to test the above hypotheses in 
the present study.


Table 1

Characteristics of the Subjects in Each Group

                       Sex             Age          Educ. (Grade)
Group           N   Male  Female   Mean     SD      Mean     SD
Normals        47    24     23     25.33    5.00    13.51   1.38
Neurotics      37    19     18     30.26   10.30    12.25   3.90
Schizophrenic  36    14     22     30.46    8.06    11.95   2.18

Method


Three groups of subjects were selected for
the present investigation.1 The normal group
consisted of 47 students who were not currently
under treatment for emotional disturbances
and who had never been under such
treatment. They were largely drawn from college
sophomores and freshman nurses. The
group representing subjects with some overt
or reported anxiety about their condition
consisted of 37 neurotic subjects currently under
treatment for an emotional disturbance either
on an inpatient or outpatient basis. Persons
whose diagnosis included some form of char-


matic delusions and delusions of persecution.

Table 1 gives the means and standard deviations
for age and education for the three
groups as well as the number of males and
females in each group. It will be noted that
the groups, in general, are fairly well equated
on all these variables. In addition, we tried
as closely as possible to secure subjects from
the same socioeconomic class.

1 The cooperation of Dr. H. T. Manuel of the
Testing and Guidance Bureau, Dr. Paul L. White of
the Health Center, University of Texas, Dr. S.
Goldstone, Baylor Medical College, and Dr. A. Foster,
Galveston State Psychopathic Hospital, in the
selection of subjects is gratefully acknowledged.


Instrument and Procedure


The Self-Activity Inventory (SAI) developed
by Worchel for the USAF was employed
in the present study.2 The Inventory is composed
of 54 statements describing responses
to the arousal of hostility, achievement,
sexual, and dependency needs. Almost all the
responses selected, however, are considered 
ineffectual since they are likely to precipitate 
conflict with social requirements, or not re- 
duce the conflicts involved. To measure the 
intensity of the responses, the S is asked to
indicate on a 5-point scale, from 1 indicat- 
ing never to 5 indicating always, how much 
of the time the activity described is like him 
(Self), how he would like to be (Ideal), and 
how it is like other people (Other). Thus, 
the higher the sum of the scores on any of 
the three columns, the more frequent are the 
ineffectual responses employed. A low score 
represents the positive self-attitude or the adjusted
end of the continuum while a high
score represents the negative or maladjusted
extreme. Some of the items on the Inventory 
are: 

1. Feels he must win in an argument;
2. Plays up to others in order to advance his position;
3. Refuses to do things because he is not good at them;
4. Tries hard to impress people with his ability.


The total discrepancy score for each subject
between self and ideal was obtained by
summing the absolute discrepancy scores for
each item, while the discrepancy score between
self and other concepts was the algebraic
difference between the total scores on
each of the two concepts. The algebraic
difference between self and other was calculated
in order to show the direction of the
difference between self and other person.

2 This study was supported in part by funds
provided under Contract AF 18(600)-913 with the
USAF School of Aviation Medicine, Randolph Field.
Correspondence concerning the use of the SAI should
be addressed to Dr. Saul B. Sells, Department of
Clinical Psychology, Randolph Field, Texas.

Thus the SAI provides one measure of the 
integrity of the self system, namely, the ab- 
solute difference between self and ideal con- 
cepts (Column I minus II) and two meas- 
ures of depreciation, the magnitude of the 
rating on the self (Column I) and the mag- 
nitude and direction of the difference between 
self and other person (Column I minus III). 


Results 

Self-Consistency

Significant differences in the discrepancy 
between self and ideal were predicted among 
the three groups of Ss. In order to perform 
the analysis of discrepancy scores between 
Self and Ideal (S-I), it was necessary to cor- 
rect the discrepancy scores to take into ac- 
count the fact that the size of the discrep- 
ancy is partly a function of the Self scores. 
To correct the obtained discrepancy scores, 
a scattergram was prepared for (S) against 
(S-I). The scattergram indicated clearly that 
the regression line for predicting discrepancy 
scores from (S) was linear and that the ar- 
rays were relatively homoscedastic. The prod- 
uct-moment correlation coefficient between 
(S) and (S-I) for all groups combined was 
0.74. The correlation was sufficiently high to 
warrant adjustment of the obtained discrepancy
scores. Each discrepancy score was corrected
by subtracting the predicted discrepancy
score from the obtained discrepancy
score. The data for the corrected scores,
(S-I)c, together with the data on self, ideal,
and (S-O)c are summarized in Table 2. Our
hypotheses predicted that the discrepancy
score for the neurotic would be greater than
that of the normals and paranoid schizophrenics,
and the discrepancy score for the
schizophrenic would be at least equal to that
of the normals.


Table 2

Means and SDs of the Concept Scores and of the
Corrected Discrepancy Scores for All
Groups of Subjects

                Normal          Neurotic      Schizophrenic
Measure        M      SD       M      SD       M      SD
Self         132.7   15.7    151.8   30.8    139.1   24.2
Ideal         98.8   15.3    103.6   23.4    108.0   19.5
Other        157.3   17.2    146.4   24.9    152.0   24.2
(S-I)c        -3.2   16.6      5.8   21.9     -6.2   20.7
(S-O)c       -10.2   16.6      7.9   25.5      3.5   24.5


Table 3

Tests of the Significance of the Difference Between the
Means of the Concept and Discrepancy
Scores on the SAI

              Normal        Normal vs.      Neurotic vs.
Measure    vs. Neurotic   Schizophrenic   Schizophrenic
Self          3.43**          1.38            2.00*
Ideal         1.08            2.34**          0.87
Other         2.26**          1.19            1.04
(S-I)c        2.07*           0.76            2.54**
(S-O)c     [illegible]        3.10*           0.80

* Significant at the .02 level.
** Significant at the .01 level.
Note. One-tailed tests of significance presented.
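
The correction described in this section, subtracting from each obtained Self-Ideal discrepancy the discrepancy predicted from the Self score, amounts to taking residuals from a linear regression. The sketch below is a minimal illustration under that reading. The function names and the five pairs of scores are hypothetical and are not data from the study.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def corrected_discrepancies(self_scores, discrepancies):
    """Obtained discrepancy minus the discrepancy predicted from the Self score."""
    slope, intercept = fit_line(self_scores, discrepancies)
    return [d - (intercept + slope * s)
            for s, d in zip(self_scores, discrepancies)]


# Hypothetical Self totals and obtained Self-Ideal discrepancies.
self_scores = [120, 135, 150, 160, 175]
raw_discrepancies = [20, 30, 45, 50, 70]
corrected = corrected_discrepancies(self_scores, raw_discrepancies)

if __name__ == "__main__":
    for s, c in zip(self_scores, corrected):
        print(f"Self = {s}: corrected discrepancy = {c:+.2f}")
```

Residuals of this kind average zero over the combined sample and are uncorrelated with the Self score, which would explain why the corrected group means in Table 2 can fall on either side of zero.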

Table 3 presents the t ratios of the differences
between the means of the (S-I)c, self,
ideal, and (S-O)c scores. The difference between
normals and neurotics on the (S-I)c
scores is 9.02 which, for a single-tailed test,
is significant at the .02 level. This differ- 
ence indicates that the neurotics perceive a 
greater discrepancy existing between their 
Self and Ideal than do the normals when the 
effect of the self-rating is partialled out, as 
was predicted from theory. 

The only other significant difference at the 
.01 level is that between the psychotic and 
neurotic. As was predicted, the direction of 
the difference is such as to indicate that the 
psychotics perceive a smaller discrepancy be- 
tween Self and Ideal than do the neurotics. 


Self-Depreciation 


It was predicted that our neurotic subjects 
would evaluate themselves more unfavorably 
than the normals while the schizophrenics 
would rate themselves at least as favorably 
as the normal subjects. On the SAI, there- 
fore, the neurotics should have a significantly 
higher score (low self-evaluation) on Column 
I (Self) and a higher positive discrepancy
between Columns I and III (Self-Other)
than normals, and the schizophrenics should
have a score on Self (I) and discrepancy between
self and other, at least equal to that
of normals. 

Table 2 contains the means and standard 
deviations for the Self and the corrected dis- 
crepancy scores between Self and Other, for 
each of the three groups. Inspection of the 
raw data indicated that the distributions ap- 
proached a normal curve sufficiently so that 
t tests of differences between the means could 
be computed. Since the groups were unequal
and the variances heterogeneous on the Self
scores, a formula for t was used which takes
into consideration the variances of both samples
as estimates of the population variance.
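
The text does not name the formula used. As a hedged illustration, the sketch below assumes the common separate-variance form, in which the standard error of the difference is the square root of the sum of each group's variance divided by its N. The function name is hypothetical. The means, SDs, and Ns are the Self scores from Table 2 and the group sizes from Table 1.

```python
import math


def separate_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """t ratio using each sample's own variance (no pooling); an assumed form."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Self scores: (mean, SD, N) taken from Tables 1 and 2.
SELF = {
    "normal": (132.7, 15.7, 47),
    "neurotic": (151.8, 30.8, 37),
    "schizophrenic": (139.1, 24.2, 36),
}

t_normal_neurotic = separate_variance_t(*SELF["neurotic"], *SELF["normal"])
t_normal_schizophrenic = separate_variance_t(*SELF["schizophrenic"], *SELF["normal"])
t_neurotic_schizophrenic = separate_variance_t(*SELF["neurotic"], *SELF["schizophrenic"])

if __name__ == "__main__":
    print(f"normal vs. neurotic:        t = {t_normal_neurotic:.2f}")
    print(f"normal vs. schizophrenic:   t = {t_normal_schizophrenic:.2f}")
    print(f"neurotic vs. schizophrenic: t = {t_neurotic_schizophrenic:.2f}")
```

Under this assumption the three ratios come out at about 3.44, 1.38, and 1.96, close to the 3.43, 1.38, and 2.00 reported for the Self scores in Table 3. The small differences may reflect rounding of the tabled means and SDs or a slightly different variance estimate.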

Table 3 shows the results of the tests of 
significance of the differences between the 
means of the Self and Self-Other scores for 
each group compared with the normals and 
with each other. The difference between the
mean Self score of the normals and neurotics
is 19.08 which is significant beyond the .01
level (t ratio of 3.43). Compared to the normals,
therefore, the neurotic rates himself
more negatively, which confirms the prediction
on the depreciated self-picture of the
neurotic.

The difference between the means of the 
normals and schizophrenics on the Self scores 
is 6.40 (t of 1.38) which, for a one-tailed
test, is not significant at the .05 level. Thus
the schizophrenic perceives himself as posi- 
tively as the normal person does, which is 
what was predicted. 

The difference between the means of the 
neurotics and schizophrenics, as we would expect,
is 12.70 (t of 2.00) which, for a one-tailed
test, is significant beyond the .02 level.
The direction of the difference indicates that
neurotics see themselves more negatively than
do schizophrenics.

From Table 3 it is clear that both maladjusted
groups differ significantly from the
normal group regarding the discrepancy between
their self concept and their concept of
others. The differences between the means of
the normals and neurotics and between the
normals and schizophrenics are significant beyond
the .01 level. The direction of the differences
in both instances was the same, indicating
that the two maladjusted groups
have a greater discrepancy between their 
self concepts and their concept of others than 
do normals when the effect of their own self- 
rating is partialled out. This finding is in line 
with our hypothesis of depreciation in neu- 
rotics but opposite to what we predicted for 
the schizophrenics. If anything, the defensive 
pattern of projection in the schizophrenic 
would lead us to expect the smallest discrep- 
ancy between the self and other in the psy- 
chotic. 


The Ideal 


It was predicted from Adlerian theory that 
the neurotic would have a higher “ideal” 
(lower score on Column II) than that of the 
normals. No predictions could be made for 
the schizophrenic. The results show that 
there was no significant difference between 
the mean “ideal” scores of the neurotics and 
normals (Table 3). The difference between 
the normals and schizophrenics of 9.20, however,
is significant at the .01 level (t of
2.34). The direction of the difference is such
as to indicate that the schizophrenics have a 
lower ideal relative to that of normals. 


Discussion 
The Neurotic 


Taken as a group, the neurotic patient sees 
himself as employing behavioral patterns that 
are much more frequently ineffectual in meeting
his needs than the normal person. His desired
goals are no higher than those of the
normal, but he sees other people as being 
more effective in meeting their needs than the 
normal sees them. When we rule out differ- 
ences in self-perception between him and the 
subjects in the other two groups, he has a 
greater self-ideal discrepancy than the others, 
that is, he secures a self-ideal discrepancy 
significantly greater than that predicted by 
his self score. In addition, he is more self- 
depreciative. Calvin and Holtzman (5) also 
found that the tendency to enhance the self 
is inversely related to maladjustment in a 
group of college subjects. Thus we have a 
person who not only does not measure up to 
his ideals but sees himself as inferior to the 
average person. As a matter of fact, his group 
is the only one of the three groups whose 


mean self-score was larger than the “other” 
score. He is, in reality, a “miserable” person. 
These findings tend to confirm Adler’s view 
(1) of the neurotic with one modification. 
Adler pictures the neurotic as developing in- 
tense inferiority feelings with an overdevelop- 
ment of the need for power. These factors 
plus the underdevelopment of “community 
feeling” lead to the setting up of fictitiously 
high goals. The “ideal,” however, is ficti- 
tiously high not relative to others but only 
when compared to the evaluation of the self. 
The goals which are set are “fictitious” in re- 
lationship to what is perceived as accom- 
plished. Self-ideal discrepancy and self-other 
depreciation go hand in hand in the neurotic. 


The Schizophrenic 

As was predicted, there was no difference 
between the normals and schizophrenics on 
self-consistency. On self-depreciation, how- 
ever, one of the measures on the SAI (Self) 
confirms the hypothesis that there is no dif- 
ference between the normals and schizo- 
phrenics. On the rating of Self, the schizo- 
phrenics rate themselves as being as effective 
in their need-satisfaction patterns as the nor- 
mals but more effective than the neurotics. 
By implication, we would have expected the 
schizophrenics to possess the least effective 
adjustmental patterns. Thus, as was sug- 
gested in the hypotheses, his self-ratings are 
probably a result of defensive distortion. 

On the other measure of self-depreciation, 
the corrected discrepancy score between self 
and other person, the results were contrary 
to our prediction. The schizophrenics rated 
themselves more depreciatively than the nor- 
mals. This depreciation was due to the fact 
that the patients tended to enhance other 
people (Column III) relative to self more 
than do the normals. 

On the ideal, however, the psychotic sub- 
ject has a significantly lower aspiration level 
than the normal person. He prefers to behave 
less effectively. This lowering of the ideal self 
could be an extension of the defensive distor- 
tion of the self concept. This defensive com- 
bination, distortion in self-appraisal and 
lowering of the ideal self, enables the schizo- 
phrenic to enhance himself relative to his 
ideal and thus avoid the anxiety arising from
a discrepancy in the self. The obtained discrepancy
score, as we would expect, therefore,
is smaller than that predicted by his
self-rating. Thus one can readily understand 
the lack of anxiety and the absence of motiva- 
tion for therapy in the schizophrenic patient. 


Summary 


On the basis of self theory and Adlerian
theory, the following hypotheses concerning 
the nature of the self-system in maladjusted 
subjects presenting anxiety and defensive pat- 
terns were proposed for the present investi- 
gation: (a) Maladjusted subjects character- 
ized by anxiety would present a depreciated 
self picture, report high ideals, and show a 
high discrepancy between self and ideal concepts.
(b) Maladjusted subjects with defensive
patterns would show little discrepancy
between self and ideal and would present a 
picture of the self similar to that of normals. 

A self-rating inventory (SAI) consisting of
54 statements of need-satisfaction patterns 
was administered to 47 normals, 37 neurotics 
(the “anxious” group), and 36 schizophrenics 
(group with defensive patterns). The inven- 
tory yielded Self, Ideal, and Other scores for 
each subject. In addition, corrected discrep- 
ancy scores between Self and Ideal, and be- 
tween Self and Others were computed. The 
results show that: 


1. The neurotic group gave significantly 
poorer self-appraisals than the other two 
groups. The normals and schizophrenics gave 
practically similar self-appraisals. 

2. On the ideal, the neurotic was not sig- 
nificantly different from the normals, but the 
schizophrenics set their level significantly 
lower than that of the normals. 

3. When the effect of the self-rating is par- 
tialled out, the self-ideal discrepancy for the 
neurotics is significantly greater than that for 
the normals and schizophrenics. There was no 
difference between schizophrenics and nor- 
mals. 

4. On the corrected self-other discrepancy,
the normals differed significantly from the
two maladjusted groups. Whereas they tended
to enhance themselves, relatively speaking,
the maladjusted groups tended to depreciate
themselves when the effect of the self-rating
was partialled out. There were no differences
in this regard among the maladjusted
groups.


Another Application of the Spiral Aftereffect
in the Determination of Brain Damage 
H. A. Page, G. Rakita, 


University of Wisconsin 


H. K. Kaplan, and N. B. Smith 
Mendota State Hospital 


This study represents another attempt to
differentiate patients diagnosed as
organic from those without such pathology in
evidence. Price and Deabler (3) have reported
considerable success with the use of
an Archimedes Spiral in spotting patients 
with organic brain involvement. On the basis 
of earlier studies (1, 4) they hypothesized 
that organic Ss would be relatively incapable 
of perceiving a negative figural aftereffect. 
The S is exposed to a rotating spiral which 
is then stopped. The negative figural after- 
effect is evidenced in the impression that the 
motionless spiral is turning in the opposite 
direction, changing in size, or moving back- 
ward or forward in space. Their results dra- 
matically support this hypothesis in that none 
of the nonorganic normal or psychiatric control
Ss failed to give some evidence of a negative
figural aftereffect while 60% of the organic
Ss indicated a complete absence of the
effect. In addition, over 90% of the control 
Ss reported a “full” effect while only 2% of 
the organic group received a comparable rat- 
ing on the basis of their verbal reports. 

The present study was undertaken in the 
interest of providing an additional test of the 
diagnostic power of the spiral aftereffect and 
to do so within the context of a carefully 
matched control group. The authors were 
concerned with the possibility that differences 
between organic and control groups in such 
factors as age, intelligence, and chronicity or 
length of hospitalization might have operated 
to enhance the differences noted in the earlier
investigation.


Method 
Apparatus 


The apparatus was essentially similar to 
that described by Price and Deabler. Differ- 
ences involved the use of an 8-inch disc rather 
than a spiral of 6-inch diameter. The disc 
was driven by a spring-powered phonograph 
motor which permitted a speed of 100 r.p.m. 
Such an arrangement seemed preferable in 
terms of its light weight and the independ- 
ence of electric outlets. The motor was 
mounted in a gray wooden cabinet 17 inches 
high and 19 inches wide. 


Procedure 


Initial instructions to Ss and their place- 
ment with respect to the apparatus were 
identical to those described by Price and 
Deabler. However, one rather than two spirals
was employed and Ss were administered
three rather than four trials. The spiral provided
an illusion of expansion while in motion
with a negative figural aftereffect of
backward movement or shrinking when rota- 
tion ceased. On each trial the spiral was set 
in motion for 30 seconds. The S was asked 
what the line appeared to be doing. The disc 
was then stopped and any verbalization by S
to the effect that the disc was moving back- 
ward in space, shrinking, or revolving in re- 
verse was accepted as evidence of a negative 
figural aftereffect. If S gave such an indica- 
tion, he was requested to indicate when the 
aftereffect was no longer apparent. The time 
was recorded.


Table 1
Comparison of Organic and Control Groups on Matching Variables

                          Organic                   Control
Variable             Mean   Sigma   Range      Mean   Sigma   Range
Age                 47.25   14.56   25-72     46.85   14.51   21-68
Education           10.63    3.79    6-16      8.75    2.41    3-12
Length of
  hospitalization    3.83    4.92   08-15      3.90    4.68   08-15


Subjects

All Ss were patients at the Mendota State
Hospital, Madison, Wisconsin. The organic
group consisted of 20 patients (10 male, 10
female). Considering the primary diagnosis,
there were seven cases of cerebral arteriosclerosis,
five prefrontal lobotomies, three
convulsive disorders, two Korsakoff’s syndrome,
one traumatic brain damage, and two
cerebral vascular accidents. All of the patients
were considered to be suffering from cortical
damage.

The control . . . patients (10 male, . . .
significantly in either mean values or variance
for any of the three variables.


Table 2
Frequency of Reported Figural Aftereffect in
Organic and Control Groups

Trial                          Organic   Control   Probability
Trial 1                           6         15        <.02
Trial 2                           5         17        <.01
Trial 3                           5         16        <.01
Total (aftereffect reported
  on at least one trial)          8         17        <.01


Results 


Analysis was made of the difference in the 
proportion of Ss in the organic and control 
groups for each of the three trials as well as 
the total trials. This analysis is presented in 
Table 2. It will be noted that significant chi 
squares were obtained in comparing the inci- 
dence of the aftereffect across the two groups 
for all three trials and total trials. 
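
The chi-square comparisons can be retraced from the frequencies in Table 2. The sketch below is offered only as an illustration: it assumes 20 Ss in each group (the control group size is inferred, not stated in the surviving text) and assumes a 2 x 2 chi square with Yates' correction, which the report does not name. The function and variable names are hypothetical.

```python
# Yates-corrected chi square for a 2 x 2 table (df = 1).
# Counts are taken from Table 2. The group size of 20 for the control
# group and the use of Yates' correction are assumptions, not stated facts.

N_PER_GROUP = 20  # assumed for both groups

# trial label -> (organic Ss reporting effect, control Ss reporting effect)
REPORTED = {
    "Trial 1": (6, 15),
    "Trial 2": (5, 17),
    "Trial 3": (5, 16),
    "Total": (8, 17),
}

# Standard chi-square critical values for df = 1.
CRITICAL = {0.02: 5.412, 0.01: 6.635}


def yates_chi2(a, b, c, d):
    """a, b = reported / not reported (organic); c, d = same for control."""
    n = a + b + c + d
    diff = max(abs(a * d - b * c) - n / 2, 0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_for(label):
    """Chi square for one row of Table 2 under the assumed group sizes."""
    organic, control = REPORTED[label]
    return yates_chi2(organic, N_PER_GROUP - organic,
                      control, N_PER_GROUP - control)


for label in REPORTED:
    value = chi2_for(label)
    if value > CRITICAL[0.01]:
        level = "<.01"
    elif value > CRITICAL[0.02]:
        level = "<.02"
    else:
        level = "n.s."
    print(f"{label}: chi2 = {value:.2f}, p {level}")
```

Under these assumptions the computed levels agree with the probability column of Table 2, which suggests, but does not establish, that a corrected chi square was the statistic employed.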

A comparison was made between the two 
groups in terms of the magnitude of the effect
as determined by Ss’ verbal reports suggestive
of its duration. A Mann-Whitney U
test failed to reject the null hypothesis, indicating
that differentiation between the groups
could not be accomplished by considering the
reported length of the aftereffect. In addition,
rank correlations, tau, were run for the con- 
trol Ss between the magnitude of the effect, 
on the one hand, and age, educational level, 
and length of hospitalization on the other. 
None of these correlations was demonstrated 
to be statistically reliable. 

It may be of interest to note that of the 
five cases of prefrontal lobotomy included in 
the organic group, three reported the after- 
effect. This incidence of aftereffect places 
these patients midway between the control 
group and the remaining brain-damaged pa- 
tients. 
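
The "midway" position of the lobotomy cases can be checked by simple arithmetic on the reported frequencies. The following is a minimal sketch, assuming a control group of 20 Ss (inferred, not stated) and taking the totals from the last row of Table 2; the variable names are hypothetical.

```python
from fractions import Fraction

# Proportion of Ss reporting the aftereffect on at least one trial.
lobotomy = Fraction(3, 5)             # three of the five lobotomy cases
control = Fraction(17, 20)            # control n of 20 is an assumption
other_organic = Fraction(8 - 3, 20 - 5)  # organic Ss excluding lobotomies

# Point halfway between the control and remaining organic proportions.
midpoint = (control + other_organic) / 2

print(f"lobotomy      {float(lobotomy):.2f}")
print(f"control       {float(control):.2f}")
print(f"other organic {float(other_organic):.2f}")
print(f"midpoint      {float(midpoint):.2f}")
```

With only five lobotomy cases the comparison is descriptive; no test of significance is implied by this calculation.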

Discussion 


The principal findings of this research sup- 
port those reported by Price and Deabler. 
However, these results fail to attribute to the 
spiral aftereffect a discriminatory ability of
the power suggested by their findings. It is
suggested that the replication of differentia- 
tion between organic and control groups adds 
more weight to the theoretical importance of 
the measure. In fact, the incidence of no ef- 
fect in the organic group is exactly the same 
as noted in the previous study. The practical 
significance of the spiral aftereffect though is 
diminished somewhat on the basis of the cur- 
rent findings. In the clinical setting, the diag- 
nosis of organicity must typically be made 
among individuals who resemble one another 
in such characteristics as age, socioeconomic 
background, cooperativeness, and so forth. If 
one considers the control group employed in 
this research to be representative of the kind 
of patients typically involved in such dis- 
criminations, it is apparent that the overlap 
between the groups would frequently result 
in false negatives and less frequently in 
false positives. 

Considering the test on an all-or-none ba- 
sis, that is, the effect is reported or is not re- 
ported, our data would suggest that 40% of 
the organic group would not be so identified. 
Conversely, some 15% of the nonorganic pa- 
tients would be inaccurately described as 
suffering cranial damage. 
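
The two error rates follow directly from the totals in Table 2 when the test is scored on an all-or-none basis. A minimal sketch, assuming 20 Ss in the control group (inferred from the 15% figure rather than stated) and using hypothetical variable names:

```python
# All-or-none scoring: an S who reports the aftereffect on any trial is
# called nonorganic; an S who never reports it is called organic.
organic_n, organic_reporting = 20, 8
control_n, control_reporting = 20, 17   # control n is an assumption

# Organic Ss who report the effect would not be identified as organic.
missed_organic = organic_reporting / organic_n

# Control Ss who fail to report the effect would be called organic.
false_organic = (control_n - control_reporting) / control_n

# Proportion of all Ss classified correctly by the all-or-none rule.
correct = ((organic_n - organic_reporting) + control_reporting) / (
    organic_n + control_n
)

print(f"organic Ss missed:         {missed_organic:.0%}")
print(f"controls called organic:   {false_organic:.0%}")
print(f"over-all correct decisions: {correct:.1%}")
```

The over-all figure is added here only for orientation; it depends on the equal group sizes of this sample and would differ in a clinic where organic cases are less common.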

It is conceivable that the measurement of
the spiral aftereffect could be improved con- 
siderably if it were not dependent upon a 
subjective verbal report. The authors are currently
attempting to develop a procedure
which more nearly parallels the nonverbal 
procedures utilized in the determination of 
the kinesthetic figural aftereffect. 


Summary 


Significant differences were obtained be- 
tween a group of patients suffering cortical 
brain damage and a matched group of patients
with a functional diagnosis in the perception
of a negative spiral figural aftereffect.
Organic patients were less likely to report
the effect. The results were interpreted
as providing additional support to the theo- 
retical implications of this measurement, but 
as providing less evidence that the aftereffect 
may serve as an effective diagnostic device.


New Tests


Brainard, Paul P., & Brainard, Ralph T. Brainard 
Occupational Preference Inventory. High school- 
adult. 1 form. Untimed, (30) min. Booklet ($5.50 
per 25), with key, and manual, pp. 12; IBM an- 
swer sheet ($1.60 per 25); specimen set (60¢). 
New York: Psychological Corp., 1945, 1956.

The Brainard inventory, which has been under intermittent
development since 1922, received some re- . . .
and personal . . . difficulty of the items has been kept
at a low level. The reliabilities of the individual scales,
for grades 10 and . . . Scores are mainly low, indicating
a fair degree of independence. Validity, aside from a . . .
earlier ones.—L. F. S.


Cooperative School and College Ability Tests
(SCAT). Examiner’s manual: First Supplement/
1956, pp. 11; Second Supplement/1956, pp. 6.
Princeton, N. J. & Los Angeles, Calif.: Educational
Testing Service, 1956.

Fulfilling the promises made when the tests were
first issued, the supplements provide additional nor- . . .


Flanagan, John C. Flanagan Aptitude Classification
Tests (FACT). Supplementary manual: Interpreting
Test Scores, pp. 12. Chicago: Science Research
Associates, 1956. 

When they were first published in 1953, it was
quite evident that FACT needed occupational norms
based on the scores of persons tested as students who
had later entered various fields of work (see J. con- 
sult. Psychol., 1954, 18, 231-232). The present book- 
let presents data which begin to fulfil this require- 
ment. Pittsburgh students tested in the standardiza- 
tion of the test were followed into 23 occupations 
requiring little preparation, and into training courses 
for 19 occupations of higher level. Data on those suc- 
cessful and satisfied in work or training provide 
tentative occupational standards. For example, an 
occupational stanine of 7 seems needed for the study 
of engineering, but one of 3 often suffices for work 
as a salesperson. The present standards, while use- 
ful, are properly labeled as tentative—L. F. S. 


Gorham, Donald R. Proverbs Test. Grade 5-adult. 
3 forms, clinical form; 1 form, best answer form. 
Untimed, (20-40) min. Test blank, clinical forms 
I, II, III (0 ea.); scoring card for each clinical
form; test booklet, best answer form (1); answer 
sheet (25); scoring stencils; general manual, pp. 
12; clinical manual, pp. 17; complete kit ($4.50 
for quantities indicated; replacements available), 
Grand Forks, N. D.: Psychological Test Special- 
ists, 1956. 

Proverbs tests, and their close relatives the fables, 
have been used by psychology and psychiatry for at 
least 50 years. Gorham’s versions have received a
reasonable degree of validation and standardization. 
Although by no means without faults when judged 
by the most rigorous technical standards for test 
construction, they seem significantly better than the 
off-hand versions frequently used. In the “clinical” 
forms of the Proverbs Test, the examinee writes what 
each of 12 proverbs means. Three well-equated 
“clinical” forms permit lengthening the test, or re- 
testing without repetition of content. Explicit, printed 
scoring scales permit an interscorer reliability of .95; 
estimates of subject reliability are .79 for one form, 
.88 for two, and .92 for all three. The 40-item “best
answer” version is a multiple-choice test with a corrected
split-half reliability of .88. The clinical and
best answer forms correlate from .81 to .90. The
tentative norms, which need improvement, are based 
on modest numbers of pupils from grades 5 to 12 
and college students, all from Texas, and substantial 
numbers of Air Force enlistees. What does the test 
measure? Although a logical case can be made for 
“abstract thinking,” correlations with other tests seem 
to reveal mostly that ubiquitous factor, verbal in- 
telligence (r with vocabulary, .80). Clinical studies 
of patients, however, show that the test may tap 
schizophrenic disturbances of thought processes, espe- 
cially if interpreted in relation to vocabulary. The 
special “abstract” and “concrete” scores obtainable 
from the test are of interest in relation both to men- 
tal development and to psychopathology.—L. F. S. 
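
The three subject reliabilities quoted above (.79, .88, .92) are what the Spearman-Brown formula would predict when one, two, and three equated forms are combined. The review does not say how the estimates were obtained, so the sketch below is an illustration under that assumption only; the function name is hypothetical.

```python
def spearman_brown(r_single, k):
    """Predicted reliability of a test lengthened k times.

    r_single is the reliability of one form; k is the number of
    parallel forms combined.
    """
    return k * r_single / (1 + (k - 1) * r_single)


one_form = 0.79  # reliability reported for a single clinical form

for forms in (1, 2, 3):
    predicted = spearman_brown(one_form, forms)
    print(f"{forms} form(s): predicted reliability {predicted:.2f}")
```

Agreement between the predicted and reported values is consistent with the claim that the forms are well equated, but it would arise equally if the published figures were themselves derived from the formula.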


Harrison, M. Lucille, & Stroud, James B. Harrison-
Stroud Reading Readiness Profiles. Grades kgn-1.
1 form. 5 subtests administerable to small groups,
1 subtest individual; (80) min., in 4 sessions. Test
booklets, pp. 12 ($3.45 per 35) with class record
sheet, and teachers manual, pp. 23; specimen set
(60¢). Boston: Houghton Mifflin, 1956.

The Reading Readiness Profiles are group tests of 
the abilities and skills which children use in learning 
to read. The subtests require using symbols, making 
visual and auditory discriminations, using contexts, 
and knowing the letters of the alphabet. The mate- 
rials seem attractive and practicable for use with 
five- and six-year-olds. The six parts are interpreted 
as a profile, not combined to produce a single score. 
Interpretations are made in terms of five degrees of 
reading readiness, and are related to the instructional 
needs of individual pupils. Percentile norms for the 
subtests are adequately based on a national sample 
of over 1,400 pupils. The profile method of interpretation
raises the common hazard of applying
scores obtained from brief subtests. For example, the
20th, 40th, and 60th percentiles are used as important
cutoff points, and on Test 1 these percentiles
are represented by raw scores of 17, 19, and 20
points, which may not be significantly different. No
technical data on reliability are given, a fault perhaps
excusable in a manual intended for the use of
teachers. But surely not excusable is the absence of
any precautionary statements about drawing conclusions
from small differences in raw scores. Validation
is described in terms of content validity, with some 
data on the internal consistency coefficients of items. 
No evidence is given of the validity of the test for 
predicting subsequent progress in learning to read.
In these respects, the test falls far short of meeting 
appropriate technical standards—L. F. S. 


Leiter, Russell G. Leiter Adult Intelligence Scale. 
Manual, pp. 52 ($2.00). Chicago: C. H. Stoelting 
Co., no date (1956). 

The Leiter Adult Intelligence Scale is a brief individual
test for adults which requires about 40 minutes
of administration time. It seems close kin to
the Army Individual Test of World War II, to 
which the author made considerable contributions. 
The three language subtests are similarities-differences,
digits forward and backward, and a story
memory test; the three nonverbal are a pathways 
test, stencil designs, and painted cubes. Four of these 
six have been previously reviewed as separate tests 
(J. consult. Psychol., 1949, 13, 386; 1950, 14, 162-163).
The manual contains sufficient directions for
administering and scoring, and tables of adult IQ 
norms for each subtest, the two subtotals, and the 
total score. Reliabilities, intercorrelations, sex differ- 
ences, and correlations with other tests are satisfac- 
torily reported for several groups. There is no state- 
ment about the sample used for the derivation of the 
norms. Implications from previously published stud- 
ies suggest that the IQ norms were obtained entirely 
from 256 young male veterans who had also been 
given the Stanford Binet. Since the reported studies 
of sex differences concern college women and men 
only, the extension of the norms to women, and in- 
deed to men in general, seems a bit risky.—L. F. S. 


Leiter, Russell G. Leiter International Performance 
Scale. Manual, Part I, Evidences of reliability and 
validity, pp. 72 ($2.50). Chicago: C. H. Stoelting 
Co., no date (1956). 

The manual is a well-documented collection of data 
about the Leiter International Performance Scale 
(LIPS), and draws both on hitherto unpublished 
studies by its author and published articles by the 
author and others. Most of the detailed information, 
including all of the author’s own material, relates to 
versions of the scale earlier than the current 1948 re- 
vision. Still, the development of the test has been 
continuous, and the earlier data are of relevant in- 
terest. Item analyses, frequency distributions of the 
scores of various groups, and correlations with tests 
and other meaningful criteria reveal that the LIPS 
is probably a useful psychometric instrument. It 
correlates about .70 to .80 with the Stanford Binet 
and a little higher with performance tests such as 
the Arthur Scale and the Progressive Matrices.
Surprisingly lacking is any clear evidence which would
defend the word “international” in the Scale’s title— 
its degree of freedom from cultural biases. This cen- 
tral issue is not treated explicitly, but data on the 
administration of the 1936 version to various races 
in Hawaii seem to suggest that the LIPS is scarcely 
more culture-free than the Binet. Part II of the 
manual, the directions for administering the 1948 re- 
vision, had been published previously (Psychol. Serv. 
Cent. J., 1950, 2, 259-343).—L. F. S.


Seashore, Harold G., & Bennett, G. K. Seashore- 
Bennett Stenographic Proficiency Test. Manual, 
Revised 1956, pp. 8. New York: Psychological 
Corp., 1956. 

The test is a worksample, now available on phonograph
records and tape (see J. consult. Psychol.,
1947, 11, 341). The revised manual contains norms 
based on 1,475 applicants and 154 experienced em- 
ployees. The correlations of the test with various 
ratings of stenographic ability run from .50 to .70.—L. F. S.


Seashore Measures of Musical Talents. Manual, Revised
1956, pp. 11. New York: Psychological Corp.,
1956. 

The new manual, prepared by the Test Division 
Staff of the Psychological Corporation, replaces the 
1939 edition by C. E. Seashore and others. The 
manual describes the tests, gives instructions for ad- 
ministration and scoring, and emphasizes precautions 
in interpretation. Validity is still defended in terms 
of content, and no evidences of prediction are cited. 
Reliabilities of the parts range from .55 to .84 at 
various grade levels. Percentile norms are now based 
on substantial thousands of cases.—L. F. S. 


Terman, Lewis M. Concept Mastery Test. Superior 
adults. 1 form. Untimed, (40) min. Test booklet 
($3.00 per 25) with keys, and manual, pp. 10; 
IBM answer sheet ($1.85 per 50); specimen set 
(35¢). New York: Psychological Corp., 1956. 

In 1939 and 1950, Terman constructed two forms 
of a verbal ability test designed to have sufficient 
ceiling for use in the follow-up studies of his intellectually
gifted children as adults. The second version,
called Form T, is now released for use with
examinees such as superior graduate students for
whom few if any other tests have sufficient difficulty.
The test consists of two parts, a synonym-antonym
subtest which seems to measure vocabulary
level and sensitivity to perceiving the relations between
verbal concepts, and an analogies subtest
which probes concepts drawn from a wide range of
informational areas. The alternate-form reliability is
reported as .94 for students with an interval of
one day to one week, and a remarkable .87 for the
gifted subjects when retested after 11 to 12 years.
Validity is explored by correlations with other tests
(e.g., .70 with the CEEB’s SAT), and relationships
to the original IQs of the gifted subjects and to their
educational attainments. The test correlates .49 with
the grade-point averages of college students, and .37
with the four-year undergraduate record of graduate
students. No other predictive validities are cited. The
range of the 190-item test is indicated by the 95th
percentile score of the gifted subjects of 177, and
the mean score of 35 of a small group of Air Force
personnel with less than high school education. The
test is an interesting contribution to research on the
highest levels of verbal ability.—L. F. S.


The Necessary and Sufficient Conditions of
Therapeutic Personality Change


Carl R. Rogers 


University of Chicago 


For many years I have been engaged in 
psychotherapy with individuals in distress. 
In recent years I have found myself increas- 
ingly concerned with the process of abstract- 
ing from that experience the general prin- 
ciples which appear to be involved in it. I 
have endeavored to discover any orderliness, 
any unity which seems to inhere in the subtle, 
complex tissue of interpersonal relationship in 
which I have so constantly been immersed in 
therapeutic work. One of the current prod- 
ucts of this concern is an attempt to state, in 
formal terms, a theory of psychotherapy, of 
personality, and of interpersonal relationships 
which will encompass and contain the phenomena
of my experience.¹ What I wish to do
in this paper is to take one very small segment
of that theory, spell it out more completely,
and explore its meaning and usefulness.


The Problem


The question to which I wish to address 
myself is this: Is it possible to state, in terms 
which are clearly definable and measurable, 
the psychological conditions which are both 
necessary and sufficient to bring about con- 
structive personality change? Do we, in other 
words, know with any precision those elements
which are essential if psychotherapeutic
change is to ensue?

1 This formal statement is entitled “A theory of
therapy, personality and interpersonal relationships,
as developed in the client-centered framework,” by
Carl R. Rogers. The manuscript was prepared at the
request of the Committee of the American Psychological
Association for the Study of the Status and
Development of Psychology in the United States. It
will be published by McGraw-Hill in one of several
volumes being prepared by this committee. Copies of
the unpublished manuscript are available from the
author to those with special interest in this field.

Before proceeding to the major task let me 
dispose very briefly of the second portion of 
the question. What is meant by such phrases 
as “psychotherapeutic change,” “constructive 
personality change”? This problem also de- 
serves deep and serious consideration, but for 
the moment let me suggest a common-sense 
type of meaning upon which we can perhaps 
agree for purposes of this paper. By these 
phrases is meant: change in the personality 
structure of the individual, at both surface 
and deeper levels, in a direction which cli- 
nicians would agree means greater integration, 
less internal conflict, more energy utilizable 
for effective living; change in behavior away 
from behaviors generally regarded as imma- 
ture and toward behaviors regarded as ma- 
ture. This brief description may suffice to in- 
dicate the kind of change for which we are 
considering the preconditions. It may also 
suggest the ways in which this criterion of 
change may be determined.²


The Conditions 


As I have considered my own clinical ex- 
perience and that of my colleagues, together 
with the pertinent research which is avail- 
able, I have drawn out several conditions 
which seem to me to be necessary to initiate 
constructive personality change, and which, 
taken together, appear to be sufficient to in- 
augurate that process. As I have worked on 
this problem I have found myself surprised 
at the simplicity of what has emerged. The
statement which follows is not offered with
any assurance as to its correctness, but with
the expectation that it will have the value of
any theory, namely that it states or implies
a series of hypotheses which are open to proof
or disproof, thereby clarifying and extending
our knowledge of the field.

2 That this is a measurable and determinable criterion
has been shown in research already completed.
See (7), especially chapters 8, 13, and 17.

Since I am not, in this paper, trying to 
achieve suspense, I will state at once, in se- 
verely rigorous and summarized terms, the 
six conditions which I have come to feel are 
basic to the process of personality change. 
The meaning of a number of the terms is not 
immediately evident, but will be clarified in 
the explanatory sections which follow. It is 
hoped that this brief statement will have 
much more significance to the reader when
he has completed the paper. Without further
introduction let me state the basic theoretical
position.

For constructive personality change to oc- 
cur, it is necessary that these conditions exist 
and continue over a period of time: 
1. Two persons are in psychological contact.

2. The first, whom we shall term the client,
is in a state of incongruence, being vulnerable
or anxious.

3. The second person, whom we shall term
the therapist, is congruent or integrated in
the relationship.

4. The therapist experiences unconditional
positive regard for the client.

5. The therapist experiences an empathic 
understanding of the client’s internal frame of 
reference and endeavors to communicate this 
experience to the client. 

6. The communication to the client of the 
therapist’s empathic understanding and un- 
conditional positive regard is to a minimal 
degree achieved. 


No other conditions are necessary. If these 
six conditions exist, and continue over a pe- 
riod of time, this is sufficient. The process of 
constructive personality change will follow. 


A Relationship 
The first condition specifies that a minimal 
relationship, a psychological contact, must 
exist. I am hypothesizing that significant
positive personality change does not occur except
in a relationship. This is of course an
hypothesis, and it may be disproved. 

Conditions 2 through 6 define the charac- 
teristics of the relationship which are re- 
garded as essential by defining the necessary 
characteristics of each person in the relation- 
ship. All that is intended by this first condi- 
tion is to specify that the two people are to 
some degree in contact, that each makes some 
perceived difference in the experiential field
of the other. Probably it is sufficient if
each makes some “subceived” difference, even 
though the individual may not be consciously 
aware of this impact. Thus it might be diffi- 
cult to know whether a catatonic patient per- 
ceives a therapist’s presence as making a dif- 
ference to him—a difference of any kind— 
but it is almost certain that at some organic 
level he does sense this difference. 

Except in such a difficult borderline situa- 
tion as that just mentioned, it would be rela- 
tively easy to define this condition in op- 
erational terms and thus determine, from a 
hard-boiled research point of view, whether 
the condition does, or does not, exist. The 
simplest method of determination involves 
simply the awareness of both client and 
therapist. If each is aware of being in per- 
sonal or psychological contact with the other, 
then this condition is met. 

This first condition of therapeutic change 
is such a simple one that perhaps it should 
be labeled an assumption or a precondition 
in order to set it apart from those that fol- 
low. Without it, however, the remaining items 
would have no meaning, and that is the rea- 
son for including it. 


The State of the Client 


It was specified that it is necessary that 
the client be “in a state of incongruence, be- 
ing vulnerable or anxious.” What is the mean- 
ing of these terms? 

Incongruence is a basic construct in the 
theory we have been developing. It refers to 
a discrepancy between the actual experience 
of the organism and the self picture of the 
individual insofar as it represents that experi- 
ence. Thus a student may experience, at a 
total or organismic level, a fear of the uni- 
versity and of examinations which are given 
on the third floor of a certain building, since
these may demonstrate a fundamental inadequacy
in him. Since such a fear of his inadequacy
is decidedly at odds with his concept
of himself, this experience is represented (dis- 
tortedly) in his awareness as an unreasonable 
fear of climbing stairs in this building, or any 
building, and soon an unreasonable fear of 
crossing the open campus. Thus there is a 
fundamental discrepancy between the experi- 
enced meaning of the situation as it registers 
in his organism and the symbolic representa- 
tion of that experience in awareness in such 
a way that it does not conflict with the pic- 
ture he has of himself. In this case to admit 
a fear of inadequacy would contradict the 
picture he holds of himself; to admit incom- 
prehensible fears does not contradict his self 
concept. 

Another instance would be the mother who 
develops vague illnesses whenever her only 
son makes plans to leave home. The actual 
desire is to hold on to her only source of 
satisfaction. To perceive this in awareness 
would be inconsistent with the picture she 
holds of herself as a good mother. Illness, 
however, is consistent with her self concept, 
and the experience is symbolized in this dis- 
torted fashion. Thus again there is a basic 
incongruence between the self as perceived 
(in this case as an ill mother needing atten- 
tion) and the actual experience (in this case 
the desire to hold on to her son). 

When the individual has no awareness of 
such incongruence in himself, then he is 
merely vulnerable to the possibility of anxiety 
and disorganization. Some experience might 
occur so suddenly or so obviously that the in- 
congruence could not be denied. Therefore, 
the person is vulnerable to such a possibility. 

If the individual dimly perceives such an 
incongruence in himself, then a tension state 
occurs which is known as anxiety. The in- 
congruence need not be sharply perceived. It 
is enough that it is subceived—that is, dis- 
criminated as threatening to the self without 
any awareness of the content of that threat. 
Such anxiety is often seen in therapy as the 
individual approaches awareness of some element
of his experience which is in sharp contradiction
to his self concept.

It is not easy to give precise operational definition
to this second of the six conditions,
yet to some degree this has been achieved.
Several research workers have defined the self 
concept by means of a Q sort by the indi- 
vidual of a list of self-referent items. This 
gives us an operational picture of the self. 
The total experiencing of the individual is 
more difficult to capture. Chodorkoff (2) has 
defined it as a Q sort made by a clinician who 
sorts the same self-referent items independ- 
ently, basing his sorting on the picture he has 
obtained of the individual from projective 
tests. His sort thus includes unconscious as 
well as conscious elements of the individual’s 
experience, thus representing (in an admit- 
tedly imperfect way) the totality of the cli- 
ent’s experience. The correlation between these 
two sortings gives a crude operational measure 
of incongruence between self and experience, 
low or negative correlation representing of 
course a high degree of incongruence. 
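
The operational measure just described reduces to a correlation between two sortings of the same items. The sketch below is illustrative only: the ten items, the pile values from 1 to 9, and both sortings are hypothetical data, and an ordinary product-moment correlation is assumed since the text does not name the coefficient.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length sortings."""
    if len(x) != len(y):
        raise ValueError("both sorts must cover the same items")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical pile placements (1 = least like me, 9 = most like me)
# for the same ten self-referent items, listed in the same order.
self_sort = [7, 2, 5, 9, 1, 4, 6, 3, 8, 5]        # sorted by the client
clinician_sort = [3, 6, 5, 2, 8, 5, 4, 7, 1, 6]   # sorted from test data

r = pearson(self_sort, clinician_sort)
print(f"self-experience correlation: {r:.2f}")
print("high incongruence" if r <= 0 else "some congruence")
```

The cutting point of zero is arbitrary and serves only to show the direction of interpretation: the lower the correlation, the greater the incongruence.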


The Therapist’s Genuineness in the Relation- 
ship 


The third condition is that the therapist 
should be, within the confines of this rela- 
tionship, a congruent, genuine, integrated per- 
son. It means that within the relationship he 
is freely and deeply himself, with his actual 
experience accurately represented by his 
awareness of himself. It is the opposite of 
presenting a facade, either knowingly or un- 
knowingly. 

It is not necessary (nor is it possible) that 
the therapist be a paragon who exhibits this 
degree of integration, of wholeness, in every 
aspect of his life. It is sufficient that he is ac- 
curately himself in this hour of this relation- 
ship, that in this basic sense he is what he 
actually is, in this moment of time. 

It should be clear that this includes being 
himself even in ways which are not regarded 
as ideal for psychotherapy. His experience 
may be “I am afraid of this client” or “My 
attention is so focused on my own problems 
that I can scarcely listen to him.” If the 
therapist is not denying these feelings to 
awareness, but is able freely to be them (as 
well as being his other feelings), then the 
condition we have stated is met.

It would take us too far afield to consider
the puzzling matter as to the degree to which
the therapist overtly communicates this reality
in himself to the client. Certainly the
aim is not for the therapist to express or talk 
out his own feelings, but primarily that he 
should not be deceiving the client as to him- 
self. At times he may need to talk out some 
of his own feelings (either to the client, or to 
a colleague or supervisor) if they are stand- 
ing in the way of the two following conditions. 
It is not too difficult to suggest an opera- 
tional definition for this third condition. We 
resort again to Q technique. If the therapist 
sorts a series of items relevant to the relation- 
ship (using a list similar to the ones devel- 
oped by Fiedler [3, 4] and Bown [1]), this 
will give his perception of his experience in 
the relationship. If several judges who have 
observed the interview or listened to a re- 
cording of it (or observed a sound movie of 
it) now sort the same items to represent their 
perception of the relationship, this second 
sorting should catch those elements of the 
therapist’s behavior and inferred attitudes of 
which he is unaware, as well as those of
which he is aware. Thus a high correlation
between the therapist’s sort and the
observers’ sort would represent in crude form
an operational definition of the therapist’s
congruence or integration in the relationship;
and a low correlation, the opposite.
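
As a rough illustration of the operation just described, the sketch below correlates a therapist's Q sort with the pooled sort of several judges. The item placements, the nine-pile scale, and the function names are hypothetical; the text specifies neither the item list nor how the judges' sorts are to be combined, so averaging them item by item is an assumption made only for this example.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sorts."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sorts must have the same length (at least 2 items)")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a sort with no variation cannot be correlated")
    return sxy / sqrt(sxx * syy)


def congruence_index(therapist_sort, judge_sorts):
    """Correlate the therapist's sort with the item-by-item mean of the judges' sorts."""
    pooled = [mean(items) for items in zip(*judge_sorts)]
    return pearson_r(therapist_sort, pooled)


# Hypothetical pile placements (1 = least characteristic, 9 = most) for ten items.
therapist = [9, 7, 8, 2, 1, 5, 6, 3, 4, 5]
judges = [
    [8, 7, 9, 1, 2, 5, 6, 4, 3, 5],
    [9, 6, 8, 2, 1, 4, 7, 3, 5, 5],
]
print(round(congruence_index(therapist, judges), 2))
```

A value near 1 would correspond to what the text calls high congruence, and a value near zero or below to its opposite; where the dividing line falls is left open by the text.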


Unconditional Positive Regard 

To the extent that the therapist finds himself
experiencing a warm acceptance of each
aspect of the client’s experience as being a
part of that client, he is experiencing
unconditional positive regard. [...] the person,
as [...] It is at the opposite [...] evaluating
attitude [...] acceptance for the client’s
expression of negative, “bad,” painful, fearful,
defensive, abnormal feelings as for his
expression of “good,” positive, mature,
confident, social feelings, as much acceptance
of ways in which he is inconsistent as of ways
in which he is consistent. It means a caring
for the client, but not
in a possessive way or in such a way as sim- 
ply to satisfy the therapist’s own needs. It 
means a caring for the client as a separate 
person, with permission to have his own feel- 
ings, his own experiences. One client describes 
the therapist as “fostering my possession of 
my own experience . . . that [this] is my 
experience and that I am actually having it: 
thinking what I think, feeling what I feel, 
wanting what I want, fearing what I fear: no
‘ifs,’ ‘buts,’ or ‘not reallys.’” This is the type
of acceptance which is hypothesized as being
necessary if personality change is to occur.

Like the two previous conditions, this 
fourth condition is a matter of degree,³ as
immediately becomes apparent if we attempt 
to define it in terms of specific research op- 
erations. One such method of giving it defi- 
nition would be to consider the Q sort for the 
relationship as described under Condition 3. 
To the extent that items expressive of uncon- 
ditional positive regard are sorted as charac- 
teristic of the relationship by both the thera- 
pist and the observers, unconditional positive 
regard might be said to exist. Such items 
might include statements of this order: “I
feel no revulsion at anything the client says”; 
“I feel neither approval nor disapproval of 
the client and his statements—simply accept- 
ance”; “I feel warmly toward the client—to- 
ward his weaknesses and problems as well as 
his potentialities”; “I am not inclined to pass 
judgment on what the client tells me”; “I 
like the client.” To the extent that both 
therapist and observers perceive these items 
as characteristic, or their opposites as
uncharacteristic, Condition 4 might be said to
be met. 

3 The phrase “unconditional positive regard” may
be an unfortunate one, since it sounds like an
absolute, an all or nothing dispositional concept.
It is probably evident from the description that
completely unconditional positive regard would
never exist except in theory. From a clinical and
experiential point of view I believe the most
accurate statement is that the effective therapist
experiences unconditional positive regard for the
client during many moments of his contact with
him, yet from time to time he experiences only a
conditional positive regard—and perhaps at times
a negative regard, though this is not likely in
effective therapy. It is in this sense that
unconditional positive regard exists as a matter
of degree in any relationship.

Empathy


The fifth condition is that the therapist is 
experiencing an accurate, empathic under- 
standing of the client’s awareness of his own 
experience. To sense the client’s private world 
as if it were your own, but without ever los- 
ing the “as if” quality—this is empathy, and 
this seems essential to therapy. To sense the 
client’s anger, fear, or confusion as if it were 
your own, yet without your own anger, fear, 
or confusion getting bound up in it, is the 
condition we are endeavoring to describe. 
When the client’s world is this clear to the 
therapist, and he moves about in it freely, 
then he can both communicate his under- 
standing of what is clearly known to the cli- 
ent and can also voice meanings in the cli- 
ent’s experience of which the client is scarcely 
aware. As one client described this second
aspect: “Every now and again, with me in a
tangle of thought and feeling, screwed up in 
a web of mutually divergent lines of move- 
ment, with impulses from different parts of 
me, and me feeling the feeling of its being all 
too much and suchlike—then whomp, just 
like a sunbeam thrusting its way through 
cloudbanks and tangles of foliage to spread 
a circle of light on a tangle of forest paths, 
came some comment from you. [It was] 
clarity, even disentanglement, an additional 
twist to the picture, a putting in place. Then 
the consequence—the sense of moving on, the 
relaxation. These were sunbeams.” That such 
penetrating empathy is important for therapy 
is indicated by Fiedler’s research (3) in which 
items such as the following placed high in the 
description of relationships created by experi- 
enced therapists: 


The therapist is well able to understand the patient’s feelings.
The therapist is never in any doubt about what the patient means.
The therapist’s remarks fit in just right with the patient’s mood and content.
The therapist’s tone of voice conveys the complete ability to share the patient’s feelings.

An operational definition of the therapist’s 
empathy could be provided in different ways. 
Use might be made of the Q sort described 
under Condition 3. To the degree that items 
descriptive of accurate empathy were sorted 
as characteristic by both the therapist and the
observers, this condition would be regarded
as existing.

Another way of defining this condition 
would be for both client and therapist to sort 
a list of items descriptive of client feelings. 
Each would sort independently, the task be- 
ing to represent the feelings which the client 
had experienced during a just completed in- 
terview. If the correlation between client and 
therapist sortings were high, accurate empathy 
would be said to exist, a low correlation indi- 
cating the opposite conclusion. 

Still another way of measuring empathy 
would be for trained judges to rate the depth 
and accuracy of the therapist’s empathy on 
the basis of listening to recorded interviews. 


The Client’s Perception of the Therapist 


The final condition as stated is that the cli- 
ent perceives, to a minimal degree, the ac- 
ceptance and empathy which the therapist 
experiences for him. Unless some communica- 
tion of these attitudes has been achieved, 
then such attitudes do not exist in the rela- 
tionship as far as the client is concerned, and 
the therapeutic process could not, by our hy- 
pothesis, be initiated. 

Since attitudes cannot be directly perceived, 
it might be somewhat more accurate to state 
that therapist behaviors and words are per- 
ceived by the client as meaning that to some 
degree the therapist accepts and understands 
him. 

An operational definition of this condition 
would not be difficult. The client might, after 
an interview, sort a Q-sort list of items re- 
ferring to qualities representing the relation- 
ship between himself and the therapist. (The 
same list could be used as for Condition 3.) 
If several items descriptive of acceptance and 
empathy are sorted by the client as charac- 
teristic of the relationship, then this condi- 
tion could be regarded as met. In the present 
state of our knowledge the meaning of “to a 
minimal degree” would have to be arbitrary. 


Some Comments 


Up to this point the effort has been made 
to present, briefly and factually, the condi- 
tions which I have come to regard as essen- 
tial for psychotherapeutic change. I have not 
tried to give the theoretical context of these 
conditions nor to explain what seem to me to
be the dynamics of their effectiveness. Such
explanatory material will be available, to the
reader who is interested, in the document
already mentioned (see footnote 1).

I have, however, given at least one means 
of defining, in operational terms, each of the 
conditions mentioned. I have done this in
order to stress the fact that I am not speaking
of vague qualities which ideally should be 
present if some other vague result is to occur. 
I am presenting conditions which are crudely 
measurable even in the present state of our 
technology, and have suggested specific op- 
erations in each instance even though I am 
sure that more adequate methods of measure- 
ment could be devised by a serious investi- 
gator. 

My purpose has been to stress the notion 
that in my opinion we are dealing with an 
if-then phenomenon in which knowledge of 
the dynamics is not essential to testing the 
hypotheses. Thus, to illustrate from another 
field: if one substance, shown by a series of 
operations to be the substance known as
hydrochloric acid, is mixed with another
substance, shown by another series of operations
to be sodium hydroxide, then salt and water
will be products of this mixture. This is true
whether one regards the results as due to
magic, or whether one explains it in the most
adequate terms of modern chemical theory.
In the same way it is being postulated here
that certain definable conditions precede
certain definable changes and that this fact
exists independently of our efforts to account
for it.


The Resulting Hypotheses 
The major value of stating any theory in
unequivocal terms is that specific hypotheses
may be drawn from it which are capable of
[...] even if the conditions [...] postulated as
necessary [...] are more incorrect [...] hope
they are [...] be of this order:

If these six conditions (as operationally
defined) exist, then constructive personality
change (as defined) will occur in the client. 

If one or more of these conditions is not 
present, constructive personality change will 
not occur. 


These hypotheses hold in any situation 
whether it is or is not labeled “psychother- 
apy.” 

Only Condition 1 is dichotomous (it either 
is present or is not), and the remaining five 
occur in varying degree, each on its con- 
tinuum. Since this is true, another hypothesis 
follows, and it is likely that this would be the 
simplest to test: 


If all six conditions are present, then the 
greater the degree to which Conditions 2 to 6 
exist, the more marked will be the construc- 
tive personality change in the client. 


At the present time the above hypothesis can 
only be stated in this general form—which 
implies that all of the conditions have equal 
weight. Empirical studies will no doubt make 
possible much more refinement of this hy- 
pothesis. It may be, for example, that if anx- 
iety is high in the client, then the other con- 
ditions are less important. Or if unconditional 
positive regard is high (as in a mother’s love
for her child), then perhaps a modest degree 
of empathy is sufficient. But at the moment 
we can only speculate on such possibilities. 


Some Implications 
Significant Omissions 


If there is any startling feature in the for- 
mulation which has been given as to the nec- 
essary conditions for therapy, it probably lies 
in the elements which are omitted. In pres- 
ent-day clinical practice, therapists operate as 
though there were many other conditions in 
addition to those described, which are essen- 
tial for psychotherapy. To point this up it 
may be well to mention a few of the condi- 
tions which, after thoughtful consideration of 
our research and our experience, are not in- 
cluded. 

For example, it is not stated that these con- 
ditions apply to one type of client, and that 
other conditions are necessary to bring about 
psychotherapeutic change with other types of
client. Probably no idea is so prevalent in 
clinical work today as that one works with 
neurotics in one way, with psychotics in an- 
other; that certain therapeutic conditions 
must be provided for compulsives, others for 
homosexuals, etc. Because of this heavy 
weight of clinical opinion to the contrary, it 
is with some “fear and trembling” that I ad- 
vance the concept that the essential condi- 
tions of psychotherapy exist in a single con- 
figuration, even though the client or patient 
may use them very differently.⁴

It is not stated that these six conditions
are the essential conditions for client-centered 
therapy, and that other conditions are essen- 
tial for other types of psychotherapy. I cer- 
tainly am heavily influenced by my own ex- 
perience, and that experience has led me to a 
viewpoint which is termed “client centered.”
Nevertheless my aim in stating this theory is 
to state the conditions which apply to any 
situation in which constructive personality 
change occurs, whether we are thinking of 
classical psychoanalysis, or any of its modern 
offshoots, or Adlerian psychotherapy, or any 
other. It will be obvious then that in my
judgment much of what is considered to be 
essential would not be found, empirically, to 
be essential. Testing of some of the stated 
hypotheses would throw light on this per- 
plexing issue. We may of course find that 
various therapies produce various types of 
personality change, and that for each
psychotherapy a separate set of conditions is
necessary. Until and unless this is demonstrated, I
am hypothesizing that effective psychotherapy
of any sort produces similar changes in
personality and behavior, and that a single set
of preconditions is necessary.

4 I cling to this statement of my hypothesis even
though it is challenged by a just completed study by
Kirtner (5). Kirtner has found, in a group of 26
cases from the Counseling Center at the University
of Chicago, that there are sharp differences in the
client’s mode of approach to the resolution of life
difficulties, and that these differences are related to
success in psychotherapy. Briefly, the client who
sees his problem as involving his relationships, and
who feels that he contributes to this problem and
wants to change it, is likely to be successful. The
client who externalizes his problem, feeling little
self-responsibility, is much more likely to be a failure.
Thus the implication is that some other conditions
need to be provided for psychotherapy with this
group. For the present, however, I will stand by my
hypothesis as given, until Kirtner’s study is
confirmed, and until we know an alternative
hypothesis to take its place.

It is not stated that psychotherapy is a
special kind of relationship, different in kind 
from all others which occur in everyday life. 
It will be evident instead that for brief mo- 
ments, at least, many good friendships fulfill 
the six conditions. Usually this is only mo- 
mentarily, however, and then empathy falters, 
the positive regard becomes conditional, or 
the congruence of the “therapist” friend be- 
comes overlaid by some degree of facade or 
defensiveness. Thus the therapeutic relation- 
ship is seen as a heightening of the construc- 
tive qualities which often exist in part in 
other relationships, and an extension through 
time of qualities which in other relationships 
tend at best to be momentary. 

It is not stated that special intellectual
professional knowledge—psychological, psy- 
chiatric, medical, or religious—is required of 
the therapist. Conditions 3, 4, and 5, which 
apply especially to the therapist, are quali- 
ties of experience, not intellectual informa- 
tion. If they are to be acquired, they must, 
in my opinion, be acquired through an ex- 
periential training—which may be, but usu- 
ally is not, a part of professional training. It 
troubles me to hold such a radical point of 
view, but I can draw no other conclusion from 
my experience. Intellectual training and the 
acquiring of information has, I believe, many 
valuable results—but becoming a therapist is 
not one of those results. 

It is not stated that it is necessary for psy- 
chotherapy that the therapist have an accu- 
rate psychological diagnosis of the client. 
Here too it troubles me to hold a viewpoint 
so at variance with my clinical colleagues. 
When one thinks of the vast proportion of 
time spent in any psychological, psychiatric, 
or mental hygiene center on the exhaustive 
psychological evaluation of the client or pa- 
tient, it seems as though this must serve a 
useful purpose insofar as psychotherapy is 
concerned. Yet the more I have observed 
therapists, and the more closely I have studied 
research such as that done by Fiedler and 
others (4), the more I am forced to the con- 
clusion that such diagnostic knowledge is not 
essential to psychotherapy. It may even be
that its defense as a necessary prelude to psy- 
chotherapy is simply a protective alternative 
to the admission that it is, for the most part, 
a colossal waste of time.⁵ There is only one
useful purpose I have been able to observe 
which relates to psychotherapy. Some thera- 
pists cannot feel secure in the relationship 
with the client unless they possess such diag- 
nostic knowledge. Without it they feel fearful 
of him, unable to be empathic, unable to ex- 
perience unconditional regard, finding it nec- 
essary to put up a pretense in the relation- 
ship. If they know in advance of suicidal 
impulses they can somehow be more accept- 
ant of them. Thus, for some therapists, the 
security they perceive in diagnostic infor- 
mation may be a basis for permitting them- 
selves to be integrated in the relationship, 
and to experience empathy and full accept- 
ance. In these instances a psychological diag- 
nosis would certainly be justified as adding to 
the comfort and hence the effectiveness of the 
therapist. But even here it does not appear to 
be a basic precondition for psychotherapy.⁶
Perhaps I have given enough illustrations 
to indicate that the conditions I have hy- 
pothesized as necessary and sufficient for psy- 
chotherapy are striking and unusual pri- 
marily by virtue of what they omit. If we 
were to determine, by a survey of the be- 
haviors of therapists, those hypotheses which 
they appear to regard as necessary to psy- 
chotherapy, the list would be a great deal 
longer and more complex. 


Is This Theoretical Formulation Useful? 


Aside from the personal satisfaction it gives 
as a venture in abstraction and generalization, 
what is the value of a theoretical statement
such as has been offered in this paper? I
should like to spell out more fully the
usefulness which I believe it may have.

5 There is no intent here to maintain that
diagnostic evaluation is useless. We have ourselves
made heavy use of such methods in our research
studies of change in personality. It is its
usefulness as a precondition to psychotherapy
which is questioned.

6 In a facetious moment I have suggested that such
therapists might be made equally comfortable by
being given the diagnosis of some other individual, not
of this patient or client. The fact that the diagnosis
proved inaccurate as psychotherapy continued would
not be particularly disturbing, because one always
expects to find inaccuracies in the diagnosis as one
works with the individual.

In the field of research it may give both 
direction and impetus to investigation. Since 
it sees the conditions of constructive person- 
ality change as general, it greatly broadens 
the opportunities for study. Psychotherapy is 
not the only situation aimed at constructive 
personality change. Programs of training for 
leadership in industry and programs of train- 
ing for military leadership often aim at such 
change. Educational institutions or programs 
frequently aim at development of character 
and personality as well as at intellectual skills. 
Community agencies aim at personality and 
behavioral change in delinquents and crimi- 
nals. Such programs would provide an oppor- 
tunity for the broad testing of the hypotheses 
offered. If it is found that constructive per- 
sonality change occurs in such programs when 
the hypothesized conditions are not fulfilled, 
then the theory would have to be revised. If 
however the hypotheses are upheld, then the 
results, both for the planning of such pro- 
grams and for our knowledge of human dy- 
namics, would be significant. In the field of 
psychotherapy itself, the application of con- 
sistent hypotheses to the work of various 
schools of therapists may prove highly profit- 
able. Again the disproof of the hypotheses of- 
fered would be as important as their confir- 
mation, either result adding significantly to 
our knowledge. 

For the practice of psychotherapy the the- 
ory also offers significant problems for con- 
sideration. One of its implications is that the 
techniques of the various therapies are rela- 
tively unimportant except to the extent that 
they serve as channels for fulfilling one of the 
conditions. In client-centered therapy, for
example, the technique of “reflecting feelings”
has been described and commented on (6, pp. 
26-36). In terms of the theory here being pre- 
sented, this technique is by no means an es- 
sential condition of therapy. To the extent, 
however, that it provides a channel by which 
the therapist communicates a sensitive em- 
pathy and an unconditional positive regard, 
then it may serve as a technical channel by 
which the essential conditions of therapy are 
fulfilled. In the same way, the theory I have 
presented would see no essential value to
therapy of such techniques as interpretation 
of personality dynamics, free association, 
analysis of dreams, analysis of the transfer- 
ence, hypnosis, interpretation of life style, 
suggestion, and the like. Each of these tech- 
niques may, however, become a channel for 
communicating the essential conditions which 
have been formulated. An interpretation may 
be given in a way which communicates the 
unconditional positive regard of the therapist. 
A stream of free association may be listened 
to in a way which communicates an empathy 
which the therapist is experiencing. In the 
handling of the transference an effective 
therapist often communicates his own whole- 
ness and congruence in the relationship. Simi- 
larly for the other techniques. But just as 
these techniques may communicate the ele- 
ments which are essential for therapy, so any 
one of them may communicate attitudes and 
experiences sharply contradictory to the hy- 
pothesized conditions of therapy. Feeling may 
be “reflected” in a way which communicates 
the therapist’s lack of empathy. Interpreta- 
tions may be rendered in a way which indi- 
cates the highly conditional regard of the 
therapist. Any of the techniques may com- 
municate the fact that the therapist is ex- 
pressing one attitude at a surface level, and 
another contradictory attitude which is de- 
nied to his own awareness. Thus one value of 
such a theoretical formulation as we have of- 
fered is that it may assist therapists to think 
more critically about those elements of their 
experience, attitudes, and behaviors which 
are essential to psychotherapy, and those 
which are nonessential or even deleterious to 
psychotherapy.
Finally, in those programs—educational,
correctional, military, or industrial—which
aim toward constructive changes in the per- 
sonality structure and behavior of the indi- 
vidual, this formulation may serve as a very 
tentative criterion against which to measure 
the program. Until it is much further tested 
by research, it cannot be thought of as a 
valid criterion, but, as in the field of
psychotherapy, it may help to stimulate critical
analysis and the formulation of alternative 
conditions and alternative hypotheses. 


Summary 


Drawing from a larger theoretical context,
six conditions are postulated as necessary and 
sufficient conditions for the initiation of a 
process of constructive personality change. A 
brief explanation is given of each condition, 
and suggestions are made as to how each may 
be operationally defined for research purposes. 
The implications of this theory for research, 
for psychotherapy, and for educational and 
training programs aimed at constructive per- 
sonality change, are indicated. It is pointed 
out that many of the conditions which are 
commonly regarded as necessary to
psychotherapy are, in terms of this theory,
nonessential.
Predictive Empathy and The Study of Values


Howard M. Halpern 


Bronx VA Hospital 


In the most commonly employed measure 
of empathy, subjects are required to predict 
the rating behavior of others. The use of pre- 
dictions makes sense since it provides an op- 
erational measure of empathy that makes it 
a manageable concept. Perhaps the greatest 
procedural objection to the predictive method 
is that it requires a very special and cumber- 
some set of conditions—namely, the existence 
and cooperation of a group of acquaintances 
of the subject. There is now no adequate way 
to measure a person’s empathy individually. 

To construct an individual empathy test, it 
will be necessary to first know what kind of 
personality attributes correlate well with pre- 
dictive empathy. The purpose of this brief 
article is to put into the literature a
correlational study that may be used toward that
end.

A sample of 37 female nurses was divided 
into four groups as previously reported (3). 
Their predictive accuracy in rating five fel- 
low group members on an 80-item inventory 
was determined. This was correlated with the 
Allport-Vernon-Lindzey Study of Values (1). 
The following correlations were found to hold: 
Social, .355; religious, .203; economic, .108;
political, .060; theoretical, -.086; esthetic,
[...].

The only correlations significant at the .05 
level were the positive correlation of predic- 
tions with Social Values and the negative cor- 
relation of predictions with Esthetic Values. 
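
The significance statement can be checked with a short calculation. The sketch below converts a correlation to a t statistic with n - 2 degrees of freedom and compares it with the tabled two-tailed .05 value for 35 df. The article does not say which test was used, so the two-tailed t test is an assumption, as is reading the printed Social coefficient as .355; the function and constant names are illustrative only.

```python
from math import sqrt


def t_from_r(r, n):
    """t statistic for testing a product-moment correlation against zero (df = n - 2)."""
    if n < 3 or not -1 < r < 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * sqrt(n - 2) / sqrt(1 - r * r)


N = 37                  # sample size reported in the text
T_CRIT_05_DF35 = 2.030  # two-tailed .05 critical t for 35 df, from standard tables

for label, r in [("social", 0.355), ("economic", 0.108)]:
    t = t_from_r(r, N)
    # True means the coefficient exceeds the .05 criterion under these assumptions
    print(label, round(t, 2), abs(t) > T_CRIT_05_DF35)
```

Under these assumptions a coefficient must exceed roughly .32 in absolute value to reach the .05 level with 37 subjects, which is consistent with the Social correlation being reported as significant and the economic correlation as not.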

The social type is summarized in the manual
of directions to the Study of Values as
follows: “The highest value for this type is love
of people. . . . The social man prizes other
persons as ends, and is therefore himself kind,
sympathetic and unselfish. . . . In its purest
form the social interest is selfless and tends to
approach very closely to the religious atti- 
tude” (2, p. 14). 

Since there have been studies indicating 
that psychologists with artistic interests are 
better empathizers than those without artistic 
interests, the negative correlation of predic- 
tive accuracy with esthetic values needs ex- 
planation. The manual is again instructive. 
“The Esthetic” is described as a man who 
“sees his highest value in form and harmony. 
Each single experience is judged from the 
standpoint of grace, symmetry, or fitness.
. . . He need not be a creative artist. . . . In
social affairs he may be said to be interested
in persons but not in the welfare of persons”
(2, p. 13).
The Validity of Judgments Based on “Blind”
Rorschach Records¹


Guinevere S. Chambers and Roy M. Hamlin 


Western Psychiatric Institute, University of Pittsburgh 


The sparsity of objective, controlled studies 
bearing on the question of whether the
Rorschach can validly identify clinical groups is
brought into sharp focus by Ainsworth in the 
recent book on the Rorschach by Klopfer 
et al. (8). In an extensive review of validity
research, she cites three studies as being the 
most significant on this point (1, 5, 9). The 
whole question of correlation between Ror- 
schach evidence and clinical groups is dis- 
missed in a rather unconcerned fashion. She 
is more interested in validating “underlying 
principles” and the inference is that no one 
with a true appreciation for projective pro- 
cedures would bother about such a question 
anyway. To quote her specifically: “all these 
‘blind’ interpretations and diagnoses seem to 
be more of a tour de force to impress the 
skeptic than to represent a serious attempt 
to test out the basic hypotheses upon which 
both interpretation and diagnosis are based”
(p. 463).

In view of Ainsworth’s admission that other
approaches to validation have not added 
much to the security of the clinician's posi- 
tion, it appears a bit early to dismiss the pro- 
cedure of correlating the Rorschach with the 
outside criterion of clinical groups on which 
there is at least some degree of agreement. 
While “underlying hypotheses” are admittedly 
important, they depend for their significance 
on the over-all validity of the technique as it
is put to use.

1 This article is based on a dissertation submitted
in partial fulfillment of the requirements for the
degree of Doctor of Philosophy, University of
Pittsburgh. Appreciation is expressed to Dr. A. W.
Bendig for his advice on statistical techniques.

When the Rorschach is used in actual clinical
practice, the clinician and the tool are an
entity. Attempts to validate the Rorschach 
“with the interpreter attached” (7) have met 
with widely varying degrees of success. Ham- 
lin (4) in comparing 10 such studies con- 
cludes that the disparity in results is a func- 
tion of differences in methodology mainly re- 
lated to the size of units employed and the 
over-all complexity of the judgment task as- 
signed to the individual psychologist. 

The present experiment is designed to meet 
Hamlin’s conditions of presenting the cli- 
nician with an adequate sample of material 
pertinent to the judgment required (total 
Rorschach) and of keeping the judgment task 
from being too complex (a single judgment 
on each of five Rorschachs). Every effort was 
made to achieve a stable criterion against 
which to check the Rorschach and to clearly 
define and delimit the task presented to the 
psychologist judges. This study simply asks: 
(a) Can clinicians validly identify patient 
groups on the basis of “blind” Rorschachs? 
(b) Is there a difference in the Rorschach 
elements used as a basis for interpretation by 
clinicians with varying degrees of success on 
this task? 


Method 


Twenty psychologists were each asked to 
identify five Rorschachs according to clinical 
group. The clinical groups were limited to 
five, and the Rorschach judges were informed 
as to which groups were represented. The 
judges were also told that they would receive 
one record from each group. The five clinical 
groups were: (a) involutional depression; 
(b) anxiety neurosis; (c) paranoid
schizophrenia; (d) brain damage from
neurosyphilis; (e) adult mental deficiency. Except for
serving as an economical means of communi- 
cation in designating five distinct types of 
disorders, diagnosis per se was of minimal 
importance in this study. In selecting cases, 
a major objective was to employ selection cri- 
teria which would maximize similarity of be- 
havior within groups and minimize similarity 
between groups. 

All of the individuals whose Rorschach 
protocols constitute the raw data of this
investigation had been patients at Western
Psychiatric Institute and Clinic of the
University of Pittsburgh.²

A careful examination was made of case 
history material, medical findings and prog- 
ress notes of several hundred cases on file, 
bearing one of the five psychiatric diagnoses 
here considered. Patients were selected who 
met specific behavioral criteria for each group. 
For example, the case record of each of the 
neurotic patients was carefully studied to 
guarantee that the following criteria were 
met: no indications of schizophrenic
personality features such as bizarre thinking or loss
of affect; marked discomfort and incapacitation
from anxiety feelings as the prominent
features of the illness; psychiatric treatment 
administered on an outpatient basis; super- 
ficial adaptiveness shown, i.e., did not distort 
reality and attempted to adjust to social de- 
mands; a history of use of ineffectual de- 
fenses against anxiety which resulted in so- 
matic complaints; and nontest evidence of 
marked anxiety at time of testing. Thus, the 
Rorschach in no way biased selection of cases.
All cases originally selected on this basis were 
used in the study; none was rejected after 
the study began. 

All protocols were scored and identified 
only by randomly selected code numbers and 
sex. The age of the patient was not recorded 
on the protocol as it might serve as a clue in 
the identification of particular groups.

Twenty sets of five Rorschachs each were 
prepared for distribution to twenty judges.
Each judge was to judge five records; each
record would be judged four times, yielding a
total of 100 judgments. Selection of each rec-
ord for a given set of five was made on a
chance basis. Letters were sent to 35 psy-
chologists from all parts of the country who
were known to have had at least three years
of clinical experience which included the use
of the Rorschach technique. Final selection
of the 20 judges was determined by the order
in which replies were received. The judge’s
task was to identify each of his five Ror-
schachs according to which one of the five
possible groups he felt it belonged. In addi-
tion, each judge was asked to make four state-
ments summarizing elements of major impor-
tance influencing his thinking in arriving at
his decision on each record. Judges were en-
couraged to verbalize such “unscientific” rea-
sons as “hunches” if they felt that the judg-
ment had been reached by just such a method.

2 The five mental defectives were imbeciles, tempo-
rarily transferred for a research project from Polk
State School, Polk, Pennsylvania.


Results 
Judgment Task 


The judgment task is a forced-choice situa- 
tion and judgments do not represent inde- 
pendent events, i.e., once a choice is made, 
the chances for success on the next choice are 
greater and so on throughout the series. 
Dudek (3) has constructed a frequency dis-
tribution of scores that might be obtained by
chance in such situations. Expected frequen-
cies derived from this table were used and
the chi-square technique applied to the data. 
The number of judges making two or more
correct judgments was entered in one cell;
those making one correct judgment in the 
second; and those having no successes in the 
third. The chi-square value of 30.59 is highly 
significant. 

From the obtained results it may be con- 
cluded that trained Rorschach workers can 
identify “blind” Rorschachs according to 
known clinical groups significantly better than 
could occur by chance. Table 1 presents the 
correct and incorrect classifications for each 

3 The writers wish to thank the following who so
graciously served as judges: Doctors Lawrence M.
Baker, Marianne Beran, David Cohen, Gordon
Filmer-Bennett, Bernice Gurvich, Frederick J. Heim-
lich, Joseph S. Herrington, Bruno Klopfer, Kate L.
Kogan, William S. Kogan, Janet M. Lyon, Karen
Machover, Charles F. Mason, Gerald R. Pascal,
Zygmunt Piotrowski, Alan K. Rosenwald, James C.
Stauffacher, John W. Whitmyre, Miss Eleanor M.
Rose, and Major Wendell R. Wilkin.
Table 1

Correct and Incorrect Classifications Made by Judges

                                        Clinical group judged
                                 1          2          3          4          5
                                                    Schizo-     Brain
Actual clinical group       Depression  Neurosis   phrenia     damage   Deficiency
1. Involutional depression      10          4          1          4          1
2. Anxiety neurosis              3         11          4          2          0
3. Paranoid schizophrenia        4          5          8          3          0
4. Brain damage                  1          0          7         11          1
5. Mental deficiency             1          0          0          1         18


clinical group. Five judges of the 20 had com- 
plete success. While the chi-square value 
(30.59) is highly significant, the findings are 
possibly more meaningful when considered in 
terms of probability. In such a judging situa- 
tion, only one judge in 120 would be expected 
to make five correct choices by chance. The 
probabilities for 20 judges taken by groups 
are: two or more correct judgments, an ex- 
pectancy of 5.2 judges; one correct judgment, 
7.5 judges; and no successes, 7.3 judges. For 
our judges, 16 had two or more successes; 
two had one correct judgment; and two had 
no successes. Stated in terms of total judg- 
ments correct, of the 58, 25 were made by 
five judges; 27 by nine judges; and 6 by six 
judges. 
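
The chance expectancies quoted above can be checked with a short sketch. It assumes, as a simplification not spelled out in the text, that a judge guessing blindly assigns each of the five group labels exactly once and that all 120 such assignments are equally likely; the variable names are illustrative only. Under that assumption the count of permutations with a given number of correct matches reproduces the one-in-120 figure and the expectancies of about 5.2, 7.5, and 7.3 judges, and the resulting chi-square comes out at roughly 30.6, close to the reported 30.59 (the small gap presumably reflects rounding in the published expected frequencies).

```python
from itertools import permutations

GROUPS = 5    # clinical groups, one record from each
JUDGES = 20

# Over all 5! = 120 equally likely label assignments, count how many
# records land in their true group (fixed points of the permutation).
hits = [0] * (GROUPS + 1)
for perm in permutations(range(GROUPS)):
    correct = sum(1 for true, guess in enumerate(perm) if true == guess)
    hits[correct] += 1

total = sum(hits)                      # number of possible assignments
p_all_correct = hits[GROUPS] / total   # chance of five correct choices

# Expected number of judges in each of the three cells of the test.
expected = {
    "two or more": JUDGES * sum(hits[2:]) / total,
    "one": JUDGES * hits[1] / total,
    "none": JUDGES * hits[0] / total,
}
# Observed counts as reported in the text.
observed = {"two or more": 16, "one": 2, "none": 2}

chi_square = sum(
    (observed[cell] - expected[cell]) ** 2 / expected[cell] for cell in expected
)

print(hits)
print({cell: round(value, 1) for cell, value in expected.items()})
print(round(chi_square, 2))
```

Four correct choices cannot occur under this assumption, since four correct placements force the fifth; that is why the count for four matches is zero.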

An analysis of variance was made to deter- 
mine if the variance in judgments was a func- 
tion of (a) variation in validity among judges 
or (b) specific clinical groups being more or 
less distinct than others. To effect this analy- 
sis, successes were assigned a score of one and 
failures were entered as zero. This analysis 
follows the technique of Hoyt (6) for esti- 
mating test reliability from consistency of in- 
dividual performances upon the items of a 
test.

A significant difference between judges 
(.01) was to be expected since it was known 
that judges varied in success from five cor- 
rect judgments to no correct judgments. The 
aspect of this analysis with which we were con-
cerned deals with the question of whether
there were significant differences in difficulty
among categories. The F ratio computed on
this source of variation was significant at the
.01 level. To test whether the extreme mean
for the mental defective group contributed sig- 
nificantly more than the other group means to 
the obtained F ratio, the test for extreme 
mean recommended by Dixon (2) was ap- 
plied. This yielded a value significant at the 
.05 level, allowing the conclusion to be drawn 
that the mental defective group did differ sig- 
nificantly from the others in the direction of 
being more readily identifiable. 
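
Because successes and failures are scored 1 and 0, the sums of squares for a judges-by-groups analysis of this kind depend only on the row and column totals, so the two F ratios can be sketched without the full score matrix. The group totals below are the correct-judgment counts from Table 1. The judge totals are partly an assumption: the text fixes five judges at 5, two at 2, two at 1, and two at 0, but only says that nine judges made 27 correct judgments, and the sketch assumes three apiece. The reliability line uses the form usually attributed to Hoyt, one minus the ratio of the residual to the between-judges mean square; all names are illustrative.

```python
# Correct judgments per clinical group, out of 20 each (diagonal of Table 1).
group_totals = [10, 11, 8, 11, 18]

# Correct judgments per judge. ASSUMPTION: the nine middle judges, who
# together made 27 correct judgments, are taken to have 3 each.
judge_totals = [5] * 5 + [3] * 9 + [2] * 2 + [1] * 2 + [0] * 2

n_judges = len(judge_totals)
n_groups = len(group_totals)
n_cells = n_judges * n_groups
grand_total = sum(group_totals)

# With 0/1 scores, the sum of squared scores equals the sum of scores.
correction = grand_total ** 2 / n_cells
ss_total = grand_total - correction
ss_judges = sum(t ** 2 for t in judge_totals) / n_groups - correction
ss_groups = sum(t ** 2 for t in group_totals) / n_judges - correction
ss_residual = ss_total - ss_judges - ss_groups

ms_judges = ss_judges / (n_judges - 1)
ms_groups = ss_groups / (n_groups - 1)
ms_residual = ss_residual / ((n_judges - 1) * (n_groups - 1))

f_judges = ms_judges / ms_residual    # 19 and 76 degrees of freedom
f_groups = ms_groups / ms_residual    # 4 and 76 degrees of freedom
hoyt_reliability = 1 - ms_residual / ms_judges

print(round(f_judges, 2), round(f_groups, 2), round(hoyt_reliability, 2))
```

Under these assumptions the ratios are about 3.2 for judges and 4.6 for clinical groups, both large enough to be in keeping with the .01 levels reported above. A different split of the 27 judgments among the nine middle judges would change the between-judges figure somewhat.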

A further analysis was made to determine 
whether specific records may have been espe- 
cially misleading. It was found that no record 
was misjudged all four times. The highest 
single occurrence of certain categories being 
consistently interchanged was for the organic 
and paranoid groups. The organic records 
were misjudged as paranoid seven times out 
of the total of 20 judgments. This is of con- 
siderable interest in that the presence of delu- 
sions was one of the selection criteria for each 
of these groups. 


Stated Reasons for Making Judgments 


The study of the reasons which the judges 
gave for their decisions was regarded as an 
exploratory procedure. From inspection of the 
reasons given by judges for making judgments 
many methods for classification are suggested. 
By considering the statements of the group 
of judges making five correct judgments (suc- 
cessful judges) in contrast to the group mak- 
ing one or no correct judgments (unsuccess- 
ful judges) certain elements of difference are 
immediately apparent. Most striking is the 
difference in the length of statements. Suc- 
cessful judges use fewer words to communi- 
cate their thinking than do the unsuccessful. 
In attempting to analyze what this difference 
represents in terms of the thinking employed,
the emerging impression was that the success- 
ful judges tend to reach a higher level of 
abstraction from the raw data than do un- 
successful judges. On the basis of the ob- 
servation of the varying degrees of abstract- 
ness of the statements, two workers arrived 
at a three-point scale for classifying judges’ 
statements according to levels of abstract- 
ness. Representative statements were selected 
for each point of the scale. The scale pro- 
gressed from Level 1 to 3 in the direction of 
“distance” from the raw data. Statements in 
Level 1 were those where the evidence was 
presented in strictly Rorschach scoring terms. 
Level 2 was reserved for statements where 
evidence from many sources was cited, but no 
generalization was drawn. Level 3 included 
statements where (a) an over-all appraisal of 
the record was made; (b) a generalization 
was drawn; and (c) the reasoning was ex- 
pressed in general clinical terms rather than 
in Rorschach terminology. 


Examples of rated statements from the three-
point scale are: Level 1 statement: “No M”; Level
2 (paranoid): “Some loss of distance on IV, with
an emotional reaction as if the picture were real”;
Level 3 (paranoid): “defective reality-testing com-
bined with blocking and defensive over-caution.”


Table 2

Significance of Difference Between Reasons [Given by]
Successful and Unsuccessful Judges

                             Per cent of responses by levels
                 Number
Judges          of judges    Level 1    Level 2    Level 3
Successful          5          14.9        …          …
Unsuccessful        6          22.4        …          …

percentages to facilitate
comparison. The necessary correction for the
use of percentages was applied. It can be seen
from the table that the statements of success- 
ful judges fell in Levels 1 and 2 less fre- 
quently than did those of the unsuccessful 
judges, and that 61% of successful judges’ 
statements were in Level 3 as contrasted to 
30% of those of unsuccessful judges. It was 
the impression of the two raters that indi- 
vidual judges from the unsuccessful group 
tended to follow one approach—Level 1, 2, 
or 3—more rigidly than did successful judges, 
although this cannot be demonstrated sta- 
tistically. Successful judges would shift from 
one level of abstractness to another in evalu-
ating a given protocol, suggesting a greater
degree of adaptiveness and ability to be se- 
lective on the part of these judges in deciding 
what is pertinent in a given record. 


Discussion 


The results of this study justify the conclu-
sion that some experienced clinicians, on the
basis of total Rorschach protocols, can iden-
tify rather clear-cut patient groups with a de-
gree of success better than chance. In evaluat-
ing this conclusion, consideration should be
given to the conditions of the experiment re-
ported here: the choice called for was re-
stricted to five categories, and other details
of procedure were highly favorable to correct 
judgments. 

The judges did indeed attain a high degree 
of success in identifying the Rorschachs of 
adult imbeciles. On the other hand, they were 
right only half of the time in distinguishing
between depression, neurosis, paranoid schizo- 
phrenia, and brain damage. This degree of 
success is certainly not impressive enough to 
justify expansive claims for the value of the 
Rorschach as a technique in identifying pa- 
tient groups. 
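
The success rates discussed here can be read off the classification table directly. The sketch below enters Table 1 as a matrix of actual group (rows) by judged group (columns); three cells that are damaged in the printed table are entered as 1, 11, and 11, an inference from the fact that each row must total 20 judgments. The variable names are illustrative only.

```python
labels = ["depression", "neurosis", "schizophrenia", "brain damage", "deficiency"]

# Rows: actual clinical group. Columns: clinical group judged.
table_1 = [
    [10, 4, 1, 4, 1],
    [3, 11, 4, 2, 0],
    [4, 5, 8, 3, 0],
    [1, 0, 7, 11, 1],
    [1, 0, 0, 1, 18],
]

# Correct judgments lie on the diagonal.
correct = [row[i] for i, row in enumerate(table_1)]
per_group = {
    name: hits / sum(row) for name, hits, row in zip(labels, correct, table_1)
}
overall = sum(correct) / sum(sum(row) for row in table_1)

# Success rate with the mental deficiency records set aside.
other_rows = table_1[:4]
others = sum(correct[:4]) / sum(sum(row) for row in other_rows)

print(per_group)
print(overall, others)
```

On these entries the deficiency records are identified in 18 of 20 judgments, the remaining four groups in 40 of 80, and all records together in 58 of 100.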

Of the twenty judges in the study, five 
succeeded in identifying correctly all five 
protocols submitted to them. Four judges 
missed in all, or in all but one, of the choices 
they made. The study does not justify any 
firm conclusions as to the consistent differ- 
ential ability of various judges. In all prob-
ability, some judges are better than others.
Tentative comparisons, offered only as pos- 
sible leads, were made between those judges 
who seemed most successful, and those who 
seemed least successful. These comparisons
suggest, only as possible questions for future
consideration, that successful judges: (a)
have had recent experience in interpreting 
“blind” Rorschachs, where the Rorschachs 
were actually administered by someone else; 
(b) show considerable flexibility in shifting 
from one level of interpretation to another; 
and (c) tend to be free of slavish adherence 
to textbook statements in regard to scores, 
“signs,” etc.; but rather judge in terms of
second or third level inferences related to 
over-all concepts of psychopathology. None 
of these suggestions are actually tested in the 
experiment reported. 


Summary 


This study has investigated the validity of 
judgments made by clinicians on the basis of 
“blind” Rorschach records and an analysis
was made of the thought processes of cli- 
nicians when making these judgments. 

Each of 20 clinicians, experienced in the 
use of the Rorschach technique, was given 
the task of identifying five Rorschach proto- 
cols according to clinical group. The groups 
were: (a) involutional depression; (b) para-
noid schizophrenia; (c) anxiety neurosis; (d)
brain damage due to syphilis; and (e) adult 
mental deficiency. The judges were told which 
five clinical groups were represented and that 
they would receive one Rorschach record from 
each group. The secondary phase of the 
study, concerning the thought processes of 
the judges, was a pilot investigation. Each 
judge was asked to make four statements 
summarizing elements of major importance 
influencing his thinking in arriving at the 
decision on each record. A method for clas-
sifying these statements was devised by com-
paring statements of the five most successful
judges with those of the six least successful
judges.


The following major conclusions were
reached with reference to this sample of Ror-
schach records and these judges:

1. Some clinicians can identify “blind” Ror- 
schach records according to clinical groups, 
when cases are selected for homogeneity and 
groups are limited. Out of 100 possible judg- 
ments, clinicians were correct 58 times; five 
judges contributed 25 of the correct judg- 
ments; nine judges, 27; and the remaining six 
judges, 6.

2. Rorschachs of mental defectives can be 
identified in 90% of the cases. Judges were 
correct 51% of the time in distinguishing be- 
tween depression, neurosis, paranoid schizo- 
phrenia, and brain damage. 

3. The method proposed for analyzing the 
thought processes of clinicians working with 
the Rorschach indicated that there is a sig- 
nificant difference between the approaches of 
successful and unsuccessful judges. 


Received July 19, 1956. 
A Comparison of Client and Therapist Ratings
on Two Psychotherapeutic Variables


Malcolm H. Robertson 


Purdue University 


This study was designed to determine the 
extent of agreement between clients and thera- 
pists about changes during psychotherapy. 
Howard and Kelly (1) have suggested that 
anticipation of change may lead an indi- 
vidual to exaggerate any actual change. Con- 
sequently, observers may note little if any be- 
havioral change, but the individual reports 
far more change because he interprets what 
has occurred in terms of his anticipation of 
greater change. Thus, we would expect agree- 
ment between client and therapist when the 
client indicates no change. We would not 
expect agreement when the client indicates 
change. However, if both acknowledge that a 
change has occurred, we would expect them 
to agree on whether they are satisfied with 
the change. Where both acknowledge that a 
change has not occurred, we would not ex- 
pect them to agree on whether they are 
satisfied. 

A questionnaire of 12 statements related to 
changes in feelings or interpersonal behavior 
was administered to 23 clients and 16 thera- 
pists from two mental hygiene clinics. Each 
subject answered yes or no to whether there 
had been a change in a certain direction. The 
responses of each client and therapist pair
were compared by means of chi square. The 
data were collected while the clients were still 
receiving psychotherapy. 


The results show that when clients indicate
no specific change, there is a strong trend,
though not statistically significant, for thera-
pists to agree with them. When clients indi-
cate a specific change, there is no trend to-
ward agreement. This finding is consistent
with the idea that clients in anticipating
change may be led to exaggerate any actual
change. Where no change has occurred, such
exaggeration effects would not be in evidence.
Moreover, when clients and therapists agree
that certain changes have occurred, there is
also agreement (p = .05) regarding satisfac-
tion over these changes. When they agree that
certain changes have not taken place, there is
no trend toward agreement about the satis-
faction over the absence of these changes.
One explanation for this finding might be
that as change occurs, both client and thera-
pist may note the actual effects of this change
in the client’s adjustment. If no change oc-
curs, their evaluations may be based on dif-
ferent anticipations.

We realize that a complete evaluation of
such changes in psychotherapy should take
into account not only the degree of change
but also some estimate of the permanence
of change.


Brief Report. 
Studies in Fantasy—Daydreaming Frequency
and Rorschach Scoring Categories¹


Horace A. Page* 


University of Wisconsin 


Fantasy has traditionally assumed a role of 
importance in tests of a projective nature. 
The Rorschach has figured prominently among 
those tests which are considered capable of 
eliciting information about an individual’s 
tendency to engage in fantasy. In this study, 


the interest was directed to a consideration 


of the relationship between various formal 
aspects of Rorschach test performance and 
an independent assessment of the frequency 
of daydreaming behavior. 

Special attention is addressed to Rorschach 
movement response. Writers have typically 
stressed the relationship between movement 
responses, particularly human movement or 
M, and the extent to which S is free to uti- 
lize his imaginal processes. Klopfer makes 
this point in Developments in the Rorschach 
Technique (6, p. 256), while Beck makes an
even stronger statement in the second volume
of Rorschach’s Test, where he states,
“Rorschach’s penetration to the essence of the
movement response (M) as fantasy activity
is his greatest achievement—original as is his 
contribution to the many-sided instrument he 
fashioned. His M opens to investigation a 
sector of personality that, effective as it is in
determining an individual’s course, has been
elusive of efforts to study it objectively” (1,
p. 22). In this research movement responses
are emphasized, but other formal character-
istics of Rorschach performance were also
considered.

1 This paper was presented at the 1956 meeting of
the Midwestern Psychological Association.
2 The author is indebted to Miss Gloria Markowitz
and Mr. Conrad Nuthmann for their assistance in
analysis of the data and to Dr. Richard M. Lundy
for his helpful comments. This research was sup-
ported in part by funds supplied by the Research
Committee of the University of Wisconsin Graduate
School.


Method 


Assessment of daydreaming behavior. A 
Fantasy Scale of 201 items was administered 
to the Ss. Items in the scale represented dif- 
ferent imaginal themes, the majority of which 
were obtained from a large number of anony- 
mous reports of personally experienced day- 
dreams, submitted by 150 male and female 
college students. Table 1 presents some of the 
items from the scale. The Ss were asked to 
specify the frequency with which each fan- 
tasy was experienced on a five-point scale 
which ranged from 1 (Never experienced such 
a fantasy) to 5 (Very Frequently, experi- 
enced such a fantasy once a day or more fre- 
quently). A Productivity Score was obtained 
by simply summing the values assigned by S 
to those items which referred to various fan- 
tasies. The inventory was administered to 
groups, Ss recording their answers on a multi- 
ple-choice IBM form. 

Subjects. Eighty sophomore, junior, and 
senior women obtained from an introductory 
psychology course at the University of Wis- 
consin were administered the Fantasy scale. 
Fantasy scales were scored for productivity, 
and Ss in the upper and lower 25% of the 
distribution were selected and designated as 
the high and low daydreaming groups. Table 2 
shows that these groups are relatively com- 
parable in such characteristics as age and col- 
lege class with the exception of a greater 
heterogeneity in the ages of the high fre- 
quency group. 
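
The scoring and group selection just described amount to a sum and a ranked split, which can be sketched as follows. The ratings here are randomly generated stand-ins, not the study's data, and two simplifications are assumed: every one of the 201 items counts toward the Productivity Score, and tied scores at the cut points are resolved by order of entry. The function and subject names are hypothetical.

```python
import random


def productivity_score(ratings):
    """Sum the 1-5 frequency ratings a subject gave to the fantasy items."""
    if any(rating not in (1, 2, 3, 4, 5) for rating in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return sum(ratings)


def split_extreme_groups(scores, fraction=0.25):
    """Return (high, low) subject lists from the top and bottom fractions."""
    ranked = sorted(scores, key=scores.get)   # lowest score first
    k = int(len(ranked) * fraction)
    return ranked[-k:], ranked[:k]


# Hypothetical data: 80 subjects, 201 items each, seeded for repeatability.
rng = random.Random(0)
responses = {
    f"S{i:02d}": [rng.randint(1, 5) for _ in range(201)] for i in range(1, 81)
}

scores = {subject: productivity_score(r) for subject, r in responses.items()}
high_group, low_group = split_extreme_groups(scores)

print(len(high_group), len(low_group))
```

With 80 subjects the upper and lower 25% yield 20 subjects in each of the high and low daydreaming groups.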

Table 1

Illustrative Items from the Fantasy Scale

34. In my fantasies, I am at a social gathering where
I have a very good time.

43. In my fantasies, someone wise and understanding
solves my problems.

84. In my fantasies, I picture what it would be like
to be different animals.

104. I daydream that I defeat a rival and win out in a
romance.

128. In my fantasies, I am a machine-gun and mow
down enemy troops.

160. In my fantasies, I picture what it would be like
if I were the only one left on earth.

168. I imagine that I am in a far away place where I
have nothing to do but bask in the sun, eat good
food, and enjoy life.

194. In my fantasies, I imagine what it would be like
if I were physically disabled.

Rorschach test. The Ss in the high and low
daydreaming groups were administered the
Group Rorschach using the slides and forms
prepared by Harrower (4). From 10 to 15 
Ss were tested at one time under conditions 
which assured similarity of position in rela- 
tion to the screen and which reduced the pos-
sibility of communication between Ss.
Rorschachs were scored by the author using 
the system described by Klopfer and Kelley 
(7). The reliability of the author’s scoring 
behavior was assessed in another study and 
had been considered acceptable (9). In addi- 
tion to these scoring procedures, a Rorschach 
content analysis for anxiety and hostility was 
made as described by Elizur (3), and move-
ment responses received an additional special
scoring. For this latter evaluation, movement 
responses were scored only when Ss clearly
verbalized movement in their initial exposure
to the blots. Popular percepts as described by
Klopfer and Kelley (7) were not included. It
was felt that such movement scoring criteria
might provide a measure which would be
more sensitive to differences in daydreaming
frequency.

Table 2

Age and College Class of Subjects in the High and
Low Daydreaming Groups

                          High      Low      CR     Diff.
Age             Mean      19.60     19.15    1.15   NS
                σ          1.43       .85    2.…    <.05
                Range     18-24      …
College year*   Mean       2.55      2.20    1.32   NS
                σ           .67       .90    1.21   NS
                Range       2-4      1-4

* … standing = 1, sophomore = 2, junior = 3, and …


Results 


Initially a test was made of the absolute 
response number (R) obtained by the two 
groups. A mean R of 38.03 was obtained for 
the high group and a mean R of 28.21 was 
observed in the low group, a difference sig- 
nificant beyond the .05 level. This finding 
made it necessary to account for the relation- 
ship of all measures to response number prior 
to testing their discriminative power between
the two daydreaming groups. Table 3 con- 
tains the results of analysis of variance tests 
between high and low daydreaming Ss who 
were matched on the basis of Rorschach pro-
ductivity. M, combined movement, F, and
animal responses were considered in this fash-
ion. None of the tests reveal significant dif-
ferences attributable to daydreaming fre-
quency. The importance of controlling for R
is evidenced in the significant productivity
effects seen for F and animal responses.
Table 4 includes t tests between the high
and low groups. Although a significant dif-
ference in response number occurs, similar
differences are not noted for either the total
number of words or the average number of
words per response. The Elizur anxiety and
hostility ratings are in a direction indicative
of a greater incidence of these characteristics
in the high frequency daydreaming group, but
these differences are not statistically reliable.

Table 3

Analysis of Variance Tests of Ten High and Ten
Low Daydreaming Rorschachs Matched
for Response Number

                                   Mean     Mean
Scoring category                   High     Low      Diff.

Human Movement
  Daydreaming Groups                7.4      6.9      --
  Productivity (R)                                    …
  Daydreaming X Productivity                          --

Combined Movement (M + FM + m)
  Daydreaming Groups               14.3     12.9      --
  Productivity (R)                                    --
  Daydreaming X Productivity                          --

Form Determined Responses
  Daydreaming Groups                3.1      3.5      --
  Productivity (R)                                    .05
  Daydreaming X Productivity                          --

Animal Responses
  Daydreaming Groups               13.6     12.3      --
  Productivity (R)                                    .05
  Daydreaming X Productivity                          --

Table 4

Results of t Tests Between High and Low
Frequency Daydreaming Groups

                           Mean      Mean
                           High      Low       P of
Rorschach Measure          Group     Group     Diff.
Response Number (R)        38.03     28.21     <.05
Total Words               313.6     296.8      --
Words per Response         11.1      12.5      --
Anxiety Content             7.7       5.4      <.10
Hostility Content           6.9       5.9      --

An analysis of the results of the special
scoring procedure for movement responses is
presented in Table 5. When scored with a
somewhat more demanding criterion, it will
be noted that the incidence of M responses
does differ in the two groups, with the fre-
quent daydreaming Ss showing a greater
mean number of such percepts. The low in-
cidence of responses in some of the other
categories considered precluded the applica-
tion of statistical tests, but it is of note that
the high daydreaming group shows a greater
number of movement responses which appear
in human details, are located in unusual blot
areas, and are more typically of poor or
minus form quality.

Table 5

Movement Responses (Special Scoring) for High
and Low Daydreaming Groups

Movement                   Mean      Mean
Response                   High      Low       P of
Categories                 Group     Group     Diff.
M                           3.71      1.76     <.05
FM                          2.05      1.70     <.10
m                           1.70      1.00     no test
…                            .70       .12     no test
…                            .29       .06     no test
M, FM, or m …               1.41       .53     no test
Minus Responses              .18       …       …


Discussion 


A variety of tests have been made of the 
relations of certain aspects of performance on 
the Rorschach to the frequency of daydream- 
ing behavior as indicated by a “self-report” 
instrument. It is of interest that the hypothe- 
ses regarding movement responses are, to some 
extent, substantiated. A significant difference 
was obtained between the two daydreaming 
groups in the incidence of human movement 
responses when popular percepts were elimi- 
nated and a more rigorous scoring criterion 
was introduced. These findings provide sup- 
port for the notion that the tendency to per- 
ceive movement in the Rorschach is associ- 
ated with fantasy activity. In addition, there 
are qualitative indications suggestive of a 
tendency for the frequent daydreamer to per- 
ceive movement in partial human figures, in 
unusual locations, and with form of lower or 
minus quality. 

Although not necessarily predicted by Ror- 
schach theory, it is of interest to note that the 
two groups differ in number of responses. To 
have ignored this finding would have resulted 
in the determination of differences in other 
scoring categories. As Cronbach has sug- 
gested, however, it is more parsimonious to 
assume that R is playing a causal role (2). 
The number of words and the number of 
words per response did not differentiate the 
two groups. In a somewhat similar analysis 
of TAT data, daydreaming frequency was not 
found to be related to the amount of verbali- 
zation (8). 

In conclusion, it can be suggested that there 
are some relationships between daydreaming 
behavior and performance on the Rorschach 
test. These data are consistent with Rorschach 
theory in the sense that positive findings are 
noted for movement responses. These results, 
however, are not of a magnitude which would 
warrant their employment in the interpreta-
tion of the individual protocol. In support of
the Rorschach it should be recognized that 
the Fantasy scale as a self-report measure is 
subject to certain limitations. 
Levels of Prediction from the TAT


Seymour Fisher and Robert B. Morton 
VA Hospital, Houston, Texas 


The literature is replete with studies and 
speculations concerning the kinds of phe- 
nomena which can be predicted from projec- 
tive test responses. Summaries of this mate- 
rial are available elsewhere (4, 7, 9). It is 
somewhat confusing to examine the evidence 
concerning the validity of projective tests be- 
cause it is so contradictory. One investigator 
reports great success in predicting various be- 
haviors from the Rorschach or TAT and an- 
other investigator reports completely negative 
results. These divergences are in many in- 
stances due to obvious differences in subject 
populations used and to variations in pro- 
cedure. Thus, it is clearly easier to predict in 
a heterogeneous group than in one that is 
highly selected. Likewise, it is clear that some 
projective test indices are more cleverly de- 
vised than others and therefore give more 
valid results. However, aside from such ob- 
vious factors, there are important differences 
in results which seem to be a function of the 
area of behavior one attempts to predict. 

Kagan and Mussen (4) have pointed out 
that past studies have found less significant 
relationships between fantasy and behavior 
that is prohibited or punished in the individu- 
al’s social milieu than between fantasy and 
behavior that is culturally sanctioned. This 
point is illustrated by the fact that various 
researchers (1, 5, 8) have not found signifi-
cant relationships between amount of aggres-
sive TAT or doll-play fantasy and degree of
overt aggression among subjects from a mid-
dle-class milieu. But Mussen and Naylor (7)
demonstrated a significant link between TAT
aggressive fantasy and overt aggressive be-
havior in a group of lower-class boys for
whom this sort of behavior is more likely to
be approved. Apparently, if the individual is
set to conceal certain aspects of his behavior,
this decreases the correlation of such behav-
ior with logically related areas of fantasy.

In an analogous vein, might one not an- 
ticipate differences in fantasy vs. behavior re- 
lationships as a function of other behavioral 
dimensions? Is verbal behavior easier to pre- 
dict from fantasy than nonverbal behavior? 
Is behavior over which the individual has no 
conscious control easier to predict than be- 
havior which he can consciously influence? 
Are certain kinds of verbal behavior less diffi- 
cult to predict than others? Are behaviors 
that are usually conceptualized in purely 
physiological terms more or less predictable 
than behaviors occurring at the level of ver- 
balization and striate muscular response? The 
present study represents an attempt to answer 
some of these questions. More specifically, 
the intent was to determine the relationships 
of two different TAT scores to a whole range 
of behaviors which had been measured in a 
population of individuals who were hospital-
ized for tuberculosis.


Methods 

Behavioral Measures 

The opportunity for examining such issues 
was provided in terms of a body of data 
which was collected by Moran, Fairweather, 
Morton, et al. (2, 6). As part of a large-scale 
study of the adjustment of patients with tu- 
berculosis, they obtained a wide variety of 
measures on a group of 140 male veterans 
who were receiving treatment for tuberculosis 
in a Veterans Administration hospital. The
methods used in selecting this population, 
and the behavioral measures obtained, have 
already been described in detail elsewhere 
(2). Therefore, they will only be briefly
listed and summarized. These measures were
of the following order: 
A. Verbal responses from each patient concerning
his immediate attitudes toward the hospital situa- 
tion. There were various items which touched on his 
feelings about ward regulations, ward personnel, and 
other patients. The items were expressed in terms of 
statements with which the patient could agree or 
disagree. Scoring of responses was based on an 
a priori judgment as to whether agreement or dis- 
agreement with a particular item represented an 
adaptive or maladaptive attitude toward the hos- 
pital situation. By and large, adaptive attitudes were 
equated with wanting to conform to regulations and 
expressing positive reactions toward personnel and 
other patients. 

B. Verbal responses to a series of questions con- 
cerning prehospital adjustment. These questions con- 
cerned such a range of things as school attendance, 
number of close friends, and social achievement. An- 
swers to questions were scored in terms of a priori 
judgments concerning which of the reported behaviors
are adaptive vs. maladaptive. Thus, getting into
fights, having few friends, and belonging to few or- 
ganizations are examples of behaviors which would 
be scored in the maladaptive direction. 

C. Verbal responses to a series of questions con- 
cerning the characteristics of the patient’s original 
family orientation. The questions diversely concerned 
such topics as parents’ economic status, parents’
educational status, and parents’ mode of disciplining
children. Answers were scored relative to a priori 
standards of what is adaptive and maladaptive.
Illustratively, adaptive scores were given for reports of
high parental economic status and high parental
occupational attainment.


D. [...] of questions concerning [...] current
adjustment [...]. Patients were [...] as their
antici[...] recovered from their [...] dependents
toward their [...] considered to be [...] they
indicated that the im[...] taken care of economically
[...] positive accepting attitude [...] hospitalization.

E. [...] to regulations (e.g., staying in [...]),
relations with ward personnel [...] or convivial),
[...] other patients (e.g., hazing and disturbing
others). Each rating item offered two alternatives,
one of which was judged on an a priori basis to be
adaptive. The score for a specific item was the
number of times the adaptive alternative was checked
by all four raters. Once again, the adaptive response
was considered to be in the direction of obeying
regulations and getting along with others.

F. The ability of the patient to remain in the
hospital for the full period required to complete
treatment. Some patients find it difficult to tolerate
the demands made upon them as the result of living in
a tuberculosis treatment ward. Such patients
leave the hospital despite the fact that they are still 
sick and will seriously endanger the health of mem- 
bers of their family if they return home prematurely. 
It is obviously a maladaptive response to leave the 
hospital in this fashion and it provides a clear-cut 
index of poor adjustment to the hospital situation. 

G. Rate of recovery from the tubercular infection. 
This rate of recovery variable was defined in terms 
of the length of time required for the patient to con- 
vert from positive to negative bacteriology. Each 
patient’s sputum and gastric contents are routinely 
checked at intervals to determine their bacteriological
status. A negative report means the absence of
tubercle bacilli in the laboratory specimen. Within
the context of this study, the patient was considered
to be “converted” from positive to negative
bacteriologically after five successive negative laboratory
reports. Only 46 patients were involved in this phase
of the study. They were subjects who had been
carefully selected to participate in a national research on
the effectiveness of chemotherapy. Patients who
converted bacteriologically within five to eight months
were designated a fast recovery group and patients
who did not convert within this time period were
designated as a slow recovery group.


The various measures listed above provided 
samples of many different kinds of behavior 
of a group of men who were of lower to
lower-middle socioeconomic status; who could
be considered part of the normal population
in that they did not exhibit an unusual
frequency of personality disorders; and who at
the time evaluated were living in a similar
standardized environment.

For the purposes of the present study a
number of individual measures were derived
from the total available array. These individual
measures were selected so as to conform
to a particular conceptual scheme. It
was postulated that any behavior to be
predicted may be categorized on a continuum
having to do with how easily the individual is 
consciously able to camouflage that behavior 
so as to make it appear socially acceptable or 
how motivated he is to do so. In illustration, 
there might be cited at one extreme the rate 
of bacteriological conversion or the long-term 
ward behavior of a tubercular patient who is
intensively observed by nurses and aides. A
patient cannot consciously influence his rate
of bacteriological conversion. It is doubtful
also that a patient could for long dissimulate
sufficiently to prevent nurses and aides who
observed him intimately from detecting his
basic modes of reaction to the ward situation.
At the other extreme are behaviors which 
simply involve the patient’s own verbal de- 
scriptions of things. The patient is then free 
to shape and distort his descriptions within a 
wide range of possibilities that fit his needs. 
However, within this area of verbal report, 
there may be distinguished those descriptions 
which have important ego-involving signifi- 
cance to the patient and which might there- 
fore be twisted by him in a self-enhancing 
manner. At quite a different level are verbal
reports regarding matters that have relatively 
limited emotional significance and which the 
patient has only minor need to distort in 
terms of his own self-protective attitudes. 
Thus, a patient might self-protectively warp 
his answers concerning how much he likes 
the hospital or his ward physician, but be 
without temptation to do so in reply to sim- 
ple factual questions concerning how much 
he participates in sports or the kind of
recreation he enjoys. In line with this conceptual
scheme, the following behaviors were selected
for study:


A. Those behaviors which are relatively difficult 
for the patient to influence or to camouflage in a 
self-protective, socially approved direction. Three 
measures are included in this category. 

1. Rate of bacteriological conversion. 

2. Evaluations made by nurses and aides of the
patient’s actual overt adaptation to the ward situation.

3. The differentiation between patients who leave 
the hospital prematurely before treatment is com- 
plete and those who remain for an optimum treat- 
ment period. This differentiation involved 26 of 140 
patients who left prematurely and a comparison 
group of 45 patients randomly selected who
remained for full treatment.

B. Verbal reports concerning issues which have
relatively low ego involvement for the patient and
which are likely not to be camouflaged. Two measures
are included here:

1. A group of [...] items concerning the patient’s
relationships with peers from childhood to the
period just preceding his hospitalization. The items
mainly refer to such factors as the number and
kinds of social organizations in which membership
was held; preferred forms of recreation with friends;
number of friends; number of friendships formed in
the army; and degree to which army friendships
have carried over into civilian life. The scoring of
the answers is intended to evaluate how actively and
fully the individual has interacted with his peers.
Two-thirds of the items refer to childhood and
adolescent behavior rather than to adult behavior.
It was considered that these questions would be 
relatively nonthreatening because their apparent in- 
tent was vague and because most of them were 
phrased in a bland innocuous fashion. 

2. A group of three items concerning the occupa- 
tional level of the patient’s parents. These were ques- 
tions that simply requested information concerning 
the type of work done by the parents and how 
steadily they were employed. 

C. Verbal reports concerning issues that have rela- 
tively high ego-involving significance and are likely 
to result in camouflage and distortion. Four measures 
are embraced by this category: 

1. Forty-five questions concerning how the pa- 
tient feels about various aspects of his immediate 
situation in the hospital. These questions required 
the patient to express to the interviewer his opinions 
about his physician, the nurses, the aides, the quality 
of food served, and so forth. It was considered that 
such questions tapped opinions which the average 
patient would be embarrassed to voice openly. For 
example, if he disapproved of his physician or 
thought the nurses were inefficient he might be 
anxious that an open expression of such feeling 
would get him into trouble. Thus, his tendency 
would be to play down the negative aspects of his 
attitudes. 

2. A cluster of three questions relating to how the 
patient currently perceives the attitudes of significant 
figures outside the hospital toward his hospitaliza- 
tion. The questions requested information as to how 
alarmed his dependents were by his illness; whether 
his dependents favored or did not favor his initial 
hospitalization; and whether any person significant 
to him was pressuring him to hurry up and leave 
the hospital. It seemed likely that such questions 
would probe into areas of high tension (e.g., guilt
about the plight of dependents) and elicit defensive 
disguised verbal replies. 

3. A cluster of nine questions that refer to how 
well the patient’s parents supplied him with a stable 
environment with consistent rules and limits. These 
questions concerned how much time the parents 
spent at home, the kind of discipline they adminis- 
tered, and whether they were given to unusual drink- 
ing or drug addiction. It was assumed that such 
questions would elicit defensive responses from pa- 
tients because they touched on a deeply personal 
aspect of one’s relationship to his parents (viz., 
discipline and punishment) and because they re- 
quired reports about socially highly disapproved as- 
pects of behavior (viz., drinking and drug behavior) 
of the parents. 

4. A group of ten questions regarding the patient’s 
ability to get along with authority figures from the 
time of grade school to the present. These questions 
variously inquired concerning such issues as past ar- 
rests, failures in school, and failures in the army. 
Obviously, there would be marked temptation to 
distort answers to questions that had the intent of 
extracting so much negative information concerning
one’s past.


Fantasy Measures


Ten cards on the TAT were administered 
to each patient: 1, 2, 4, 12M, 8BM, 6BM, 
17BM, 18BM, 6GF, 7BM. In choosing TAT 
measures which would be most likely to predict
a wide range of behavior of tubercular
patients, special consideration was given
to the situation in which this behavior
occurs. The average patient who comes to be
treated for tuberculosis in the VA hospital 
setting in which the present study was carried 
out is required to adjust to a very new way of 
life. He enters into a situation in which he 
has to change radically many of his previous 
ways of doing things. He has to endure a va- 
riety of procedures which are restraining and 
frustrating. He has to give up many of his im- 
mediate life goals with the hope that in so 
doing he will better his long-term prospects. 
In this setting his physician becomes a cen- 
tral figure whose decisions take on magnified 
importance. These decisions determine how 
much freedom he has in the hospital and of 
course are always fraught with implications 
concerning how well the treatment is pro- 
gressing. The patient comes to attach exag- 
gerated importance to the words and gestures 
of his physician; and his mood tone may 
fluctuate up and down as he ascribes first 
one and then another significance to such 
words and gestures. 

These special features of the hospital treat- 
ment situation suggested that two kinds of 
fantasy variables might be particularly pertinent
as predictors of the behaviors of the
tuberculosis patient:


1. The hospital situation is one which clearly
requires an unusual degree of immediate passivity from
the patient. Yet, it also requires a long-term active
or aspiring attitude in the sense of being willing to
put up with immediate frustrations in anticipation
of attaining basic future goals. It therefore seemed
logical that a fantasy measure concerned with
achievement and aspiration should be related to
aspects of the tubercular patient’s behavior. So many
of the tubercular patient’s problems seem to cluster
about issues of activity vs. passivity that one would
expect his fantasies in this area to be meaningfully
linked to his patterns of response. The TAT index
which was selected to get at this dimension of
fantasy is a measure developed for a previous study
(3). It is an Achievement score based on the
number of instances in which story characters are
described as having high aspirations, engaging in
unusually hard work, attaining financial success, doing
an excellent job, being obligated to great effort, or
adopting a laudable aim. The score is simply a
count of all instances in which such characteristics
are depicted. An interscorer rank-order reliability of
.62 was obtained on ten independently scored TAT
records.
2. The fact that the tubercular patient finds himself
in a situation where certain authority figures
(viz., physician and nurse) take on great prominence
in his life suggested that his fantasies about father
figures and mother figures would have particular
relevance. Since so many decisions about his treatment
and his activities are in the hands of his physician
and nurse, one would assume that his fantasies
about the male authority figure and the female
authority figure would have a significant bearing on
many of his responses in the hospital situation. The
TAT measure used to get at this dimension was also
devised for an earlier study (3). It is based on an
analysis of the stories given to Cards 2, 6BM, and
7BM, which are cards that seem frequently to evoke
the maximum number of themes concerning mother
and father figures. The measure is concerned with
the definiteness or clarity of the parental images that
are projected into these TAT pictures. In a previous
study (3) it was shown that the definiteness of such
images is significantly related to certain basic
personality characteristics. The theory underlying the
measure is that parents who stand for something
definite (whether it is positive or negative) provide
their children with well-defined values; whereas
parents who are weak or fluctuating in their position on
various issues leave their children without clear-cut
standards of judgment. A parental figure in any
story was scored as definite if the story described the
parent in a clearly domineering or unfriendly role or
in a clearly favorable or friendly role. A parental
figure was scored as vague or weak if he was
described as inadequate or if the story data were too
fragmentary or unclear to permit classification in
the “definite” category. An interjudge reliability of
.81 on the dichotomous scores was obtained in
twenty independently scored protocols. A special
adaptation of this basic scoring procedure was
utilized. It was assumed that the patient would usually
project images of both the father and mother
figures in his reactions to Card 2. Each image that was
definite was given a score of +1. If either image
was unclear or weak, it was given a score of -1.
Thus, a maximum definiteness score of +2 or a
minimum definiteness score of -2 could be obtained
for Card 2. Either +1 or -1 was scored for Card
6BM and also for Card 7BM. The subject could,
therefore, obtain a total score ranging from +4 to
-4.
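
The scoring arithmetic just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function name and the boolean ratings are hypothetical, and the judgment of whether an image is "definite" is assumed to have been made already by a trained scorer.

```python
# Hypothetical sketch of the Parental Definiteness total described above.
# Each argument is a scorer's judgment: True = definite image,
# False = vague, weak, or unclassifiable image.

def definiteness_score(card2_father, card2_mother, card6bm, card7bm):
    """Return the total score: +1 per definite image, -1 per vague one.

    Card 2 contributes two images (father and mother), so it can add
    between -2 and +2; Cards 6BM and 7BM contribute one image each.
    """
    ratings = [card2_father, card2_mother, card6bm, card7bm]
    return sum(1 if definite else -1 for definite in ratings)


if __name__ == "__main__":
    # All four images definite gives the maximum of the stated range.
    print(definiteness_score(True, True, True, True))
    # Two definite and two vague images cancel out.
    print(definiteness_score(True, False, True, False))
```

Because each of the four images contributes an odd unit, the total can only take the values -4, -2, 0, +2, and +4.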


No predictions were made concerning the
direction of differences that might be obtained
from the Aspiration score and the Parental
Definiteness score. It was simply hypothesized
that both scores would tap fantasy areas that
significantly influence behavior in the unique
tuberculosis treatment situation. But more
specifically, it was hypothesized that the fantasy
scores should be most significantly linked
with behaviors that the subject cannot
consciously camouflage or which he would have
little motivation to camouflage.


Table 1

Chi-Square Tests of Differences in Various Behaviors Between Those High and Those Low in TAT
Parental Definiteness and Those High and Low in TAT Achievement

                                   Parental Definiteness                 High Achievement
                              Group w/               Level of       Group w/               Level of
Behaviors                     higher score    χ²     significance   higher score    χ²     significance

Verbal Behavior:
  Attitude toward
    hospital situation
  Family attitude
    toward hospitalization
  Goodness of parental
    behavior
  Difficulties with authority
  Parent occupational status
  Response to peers
Nonverbal Behavior:
  Rated ward behavior
  AMA vs. MHB*                MHB
  Rate of bacteriological
    conversion

[Entries surviving in the source whose cell positions cannot be recovered: .01; 4.0; 3.8; .05-.02; MHB, .05.]

* AMA = Group leaving hospital against medical advice.
  MHB = Group remaining in hospital for maximum hospital benefit.


Results


The results shown in Table 1 tend to
corroborate the expectations underlying this
study. The Parental Definiteness score
significantly differentiates subjects who are
above the median and below the median in
four behavior areas. Individuals who have the
most definite parental images are slower in
their rate of bacteriological conversion, less
likely to leave the hospital before fully
treated, more likely to describe themselves
as having full, satisfying relationships with
peers, and more likely to describe their
parents as being of high occupational status.
What is most important about these significant
differentiations is that they all involve
behaviors which are considered to be difficult
to camouflage or of a sort that one would
have little motivation to “dress up.” None of
the behaviors which have ego-involving
import and which are subject to conscious
manipulation could be predicted from the
Parental Definiteness score.

The results shown in Table 1 concerning 
the Achievement score are in the same direc- 
tion. Of three significant differences obtained 
none fall outside of those behaviors which are 
considered least likely to be dissimulated. 
Those subjects with the higher Achievement 
scores are more likely to remain in the hos- 
pital for their full treatment; they have a 
greater probability of being rated by nurses 
and aides as adjusting well on the ward; and 
they are more likely to describe themselves 
as having satisfying involvement with their 
peers. 
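
As an illustration of the kind of test reported in Table 1, the following sketch computes a chi-square for a fourfold table formed by a median split on a TAT score. It rests on assumptions not stated in the text: the standard Pearson formula without continuity correction is used, and the cell counts are hypothetical, chosen only so that the column totals match the 45 MHB and 26 AMA patients mentioned earlier.

```python
# Illustrative only: Pearson chi-square for a 2 x 2 table (1 df).
# The article does not say whether a continuity correction was applied;
# none is applied here.

def chi_square_2x2(a, b, c, d):
    """Chi-square for the table [[a, b], [c, d]] without correction."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one case")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts, NOT the study's data.
# Rows: TAT score above / below the median. Columns: MHB / AMA.
chi2 = chi_square_2x2(30, 8, 15, 18)

if __name__ == "__main__":
    print(round(chi2, 2))
    # 3.84 is the standard .05 critical value of chi-square with 1 df.
    print(chi2 > 3.84)
```

A value above 3.84 would be reported as significant at the .05 level for one degree of freedom, which is the form in which the levels of significance in Table 1 are expressed.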

There is some temptation to try to account 
for the specific direction of the significant 
differences obtained. But this would be a long 
involved task and is not pertinent to the main 
objective of this paper which was to demon- 
strate that certain modes of behavior are 
more directly linked with fantasy (whether 
positively or negatively) than are other 
modes. The pattern of results suggests that 
fantasy and behavior are not two different 
realms, but rather that they are intimately
connected. It would appear that the appro- 
priate question is not whether fantasy influ- 
ences behavior, but rather whether behavior 
conceptualized in certain ways can be pre- 
dicted from measures derived from specific 
defined conceptualizations of fantasy data. 


Summary 


The purpose of the study was to relate two 
measures of fantasy derived from the TAT 
to a variety of behavioral measures obtained 
from a group of persons hospitalized for treat- 
ment of tuberculosis. It was hypothesized that 
the fantasy measures would predict best those 
behaviors least subject to camouflage by the 
subjects. The pattern of results was signifi- 
cantly in the predicted direction. 


Validities of Abbreviated WAIS Scales¹


Eileen Maxwell²


Fordham University 


Much discussion has been generated in re- 
cent years over the use and validity of short 
intelligence tests whether they be specially 
constructed scales or abridgments of those al- 
ready in use. Proponents offer numerous rea- 
sons for employing such scales, highlighting 
their utility for the use of courts, schools, in- 
stitutions, and military centers in the rapid 
estimation of intellectual level. Critics have 
attacked this practice, stressing the less satis- 
factory reliability and validity of brief tests. 

Various combinations of selected subtests 
of the Wechsler-Bellevue Intelligence Scale 
(WB) have been recommended by Cotzin 
and Gallagher (2, 3), Cummings, MacPhee, 
and Wright (4), Geil (6), Gurvitz (7), Hilden,
Taylor, and DuBois (9), Kriegman and
Hansen (11), Patterson (13, 14), Rabin 
(15), Springer (17), etc. The worth of their 
shortened scales has been questioned by Mc- 
Nemar (12) who pointed out that the sam- 
ples used in these studies were far from those 
of normal populations, being either too homo- 
geneous or too heterogeneous. 


[...]tion between the sum of [...]
scores and the Full Scale [...]
based on a sample representative [...]
population and that, when WB subtests were
used, validity coefficients should be computed
from the group data obtained from the
standardization of the WB. Since the raw data
necessary for these correlations were not
available, McNemar devised a formula utilizing
the intercorrelations of subtests. Through
application of this formula, he ascertained the
ten best teams of two, three, four, and five
subtests of the WB.

1 This article is based on sections of a dissertation
[...] the preparation of the dissertation.

A revision of this scale appeared in 1955 
as the Wechsler Adult Intelligence Scale 
(WAIS). Composed of six verbal and five 
performance tests, the WAIS yields Verbal, 
Performance, and Full Scale Scores, as did 
the WB. Employing data gathered in the 
WAIS standardization, Doppelt (5) was the 
first to publish a study of the effectiveness of 
an abbreviated WAIS scale in estimating Full 
Scale Score. He arbitrarily decided upon a 
four-part scale and chose the two verbal and 
the two performance subtests which corre- 
lated most highly with the Verbal and Per- 
formance Scales, respectively. The resultant 
scale consisted of the Arithmetic, Vocabulary, 
Block Design, and Picture Arrangement sub- 
tests. In all seven age groups used in the 
Doppelt study, the correlation coefficients be- 
tween the sum of the scaled scores of the four 
subtests and the Full Scale Score were .95 or
.96. Regression equations for predicting the
Full Scale Scores for each age group were 
presented. 


Procedure 


The present study proposed to discover 
through the use of McNemar’s formula the 
best abbreviated scales of two, three, four, 
and five WAIS subtests. The investigation is 
based on the performance of the 300 persons 
in the 25-34 year age group used in the WAIS 
standardization. The required statistics for 
this group are available in the published 
manual (19, Table 8). 
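
The procedure can be illustrated with the general formula for the correlation between a sum of some subtests and the sum of all subtests, computed from intercorrelations alone. This is a sketch under assumptions: it is the textbook part-whole correlation for equally weighted subtests with equal variances, which McNemar's formula is understood to resemble but whose exact published form is not reproduced here; the function names and the four-subtest matrix are hypothetical and are not WAIS data.

```python
from itertools import combinations
from math import sqrt


def part_whole_r(R, subset):
    """Correlation between the sum of the subtests in `subset` and the
    sum of all subtests, given the intercorrelation matrix R (unit
    diagonal) and assuming equal subtest variances."""
    n = len(R)
    cov = sum(R[i][j] for i in subset for j in range(n))
    var_part = sum(R[i][j] for i in subset for j in subset)
    var_whole = sum(R[i][j] for i in range(n) for j in range(n))
    return cov / sqrt(var_part * var_whole)


def best_scales(R, k, top=3):
    """Rank every k-subtest combination by its part-whole correlation."""
    n = len(R)
    scored = [(part_whole_r(R, s), s) for s in combinations(range(n), k)]
    return sorted(scored, reverse=True)[:top]


# Hypothetical 4-subtest intercorrelation matrix, not WAIS data.
R = [
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.5, 0.3],
    [0.5, 0.5, 1.0, 0.4],
    [0.4, 0.3, 0.4, 1.0],
]

if __name__ == "__main__":
    for r, subset in best_scales(R, 2):
        print(subset, round(r, 3))
```

With the eleven WAIS subtests, the same exhaustive ranking would be run for k = 2, 3, 4, and 5 on the intercorrelations published for the chosen age group.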
ous investigators are contained in Table 7.
The coefficients of corresponding WAIS scales,
as determined in this study, are shown for
comparison between the WB scales reported
for clinical populations and the WAIS scales
based on a normative population; the superior
validities of the WAIS scales are again
evident.


Table 7

Correlation Coefficients Between WB Abbreviated
Scales and Full-Scale Score as Reported for
Clinical Populations and Corresponding
WAIS Coefficients

                                            r with     r with WAIS total
Scale          Investigator                 WB total   (present study)

D PA           Gurvitz (7)                  .90        .917
               Patterson (13)               .81        .917
               Cotzin and Gallagher (3)     .75        .917
C A            Patterson (13)               .85        .870
               Herring (8)                  —*         .870
V BD           Hilden (9)                   .91        .929
V PC           Hilden                       .89        .914
S V            Hilden                       .88        .885
V PA           Hilden                       .88        .909
C A S          Rabin (15)                   .80        .912
               Hunt (10)                    .78        .912
               Springer (17)                .92        .912
C A PA         Patterson (13)               .89        .922
               Herring (8)                  —*         .922
C V DS         Patterson (13)               .93        .914
S V BD         Hilden (9)                   .937       .938
I S V BD       Kriegman and Hansen (11)     .91        .957
C S D BD       Patterson (13)               .93        .947
V C PC BD      Patterson (13)               .96        .951
               Herring (8)                  —*         .951
I DS PC PA     Geil (6)                     .952       .949
C S PA BD      Cotzin and Gallagher (2)     .936       .950
C A DS PC BD   Herring (8)                  —*         .966

* No coefficients were reported by Herring, who merely
recommended these scales.

Examination of the intertest correlations,
as presented in the manuals (18, Table 41;
19, Table 8) for the pertinent age group,
reveals that of the 45 correlations among the
ten WB subtests all but seven are lower than
those in the corresponding WAIS data. Since
these statistics are elements used in determining
correlation coefficients, they have obviously
contributed to the larger rs of the
WAIS scales.

The influence of the inclusion of the Vocabulary
subtest in the WAIS is also seen to
be significant. With few exceptions the
correlation of each subtest with Vocabulary is
higher than with the remaining nine subtests.
The validities of all abbreviated scales are
affected since the two Σr components of
the formula for correlation include
intercorrelations with the Vocabulary test. While this
tends to result in higher correlations for all
abbreviated scales, those containing Vocabulary
reflect more directly the fact of higher
intertest correlations.

Data for the WAIS reveal Vocabulary as
the most reliable subtest, having the smallest
standard error of measurement. It is to be
regretted that reliability coefficients for the WB
subtests were not published with the other
standardization data, since a consideration of
changes in reliability between WAIS and WB
subtests might have proved [...]
the discussion of abbreviated scale validity.
Probably the major reason for the higher
[...] in this study is the higher reliability of the
WAIS subtests, and undoubtedly some of the
changes in composition of the best abbreviated
WAIS and WB scales result from shifts
in relative reliability of subtests.

The best abbreviated scales of WAIS verbal
subtests are presented in Table 8. The
increase in r with increase in number of subtests
is evident in these verbal scales as in the
abbreviated scales presented in Tables 2 through
5. The coefficients approach the limiting value
of .95, the figure reported by Wechsler as the
correlation between the Verbal Scale and the
full WAIS. As a check on procedure, the
coefficient for the full Verbal Scale, i.e., all six
verbal subtests, was computed by a special
[...] of McNemar’s formula and found to be
[...].

Table 9 presents the best abbreviated per- 
formance scales according to the present cri- 
terion. It may be noted that these yield 
correlations with the Full-Scale Score considerably
lower than the verbal scales. The r for
the combination of all five performance sub- 
tests was .928, as compared to .92, Wechsler’s 
value for the correlation between the Perform- 
ance Scale and the full WAIS. 

Both Tables 8 and 9 have been included to 
serve the particular needs of an examiner.


Table 8

Number of Possible Combinations and Correlations
Between Full-Scale Score and Best Abbreviated
Scales of One, Two, Three, Four, Five,
and Six WAIS Verbal Subtests

                  Possible        Abbreviated
Scale length      combinations    scale            r

Single subtest         6          [...]            [...]
Duads                 15          [...]            [...]
Triads                20          [...]            [...]
Tetrads               15          [...]            [...]
Pentads                6          [...]            [...]
Verbal scale           1          I C A S D V      [...]

* The rs of single subtests with Full-Scale Score are taken
from the WAIS Manual (19, Table 8).


Table 9

Number of Possible Combinations and Correlations
Between Full-Scale Score and Best Abbreviated
Scales of One, Two, Three, Four, and Five
WAIS Performance Subtests

                  Possible        Abbreviated
Scale length      combinations    scale              r

Single subtest         5          PC                 .784
                                  PA                 [...]
                                  BD                 .76
                                  DS                 [...]
                                  OA                 .65
Duads                 10          PC PA              .878
                                  DS PC              .864
                                  BD PA              .862
                                  PC BD              .853
                                  DS PA              .853
Triads                10          DS PC PA           .914
                                  DS BD PA           .906
                                  DS PC BD           .904
                                  PC BD PA           .904
                                  PC BD OA           .895
Tetrads                5          DS PC BD PA        .935
                                  DS PC PA OA        .918
                                  DS BD PA OA        .905
                                  DS PC BD OA        .902
                                  PC BD PA OA        .900
Pentads                1          DS PC BD PA OA     .928

* The rs of single subtests with Full-Scale Score are taken
from the WAIS Manual (19, Table 8).


The selection of an abbreviated scale depends 
upon the situation in which it must be ad- 
ministered and the purpose for which it is be- 
ing given. After the criteria of time, clinical 
intent, and required accuracy have been es- 
tablished, these tables may be examined to 
find the appropriate test or abbreviated scale. 
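
The "possible combinations" columns of Tables 8 and 9 are binomial coefficients, and they can be checked directly. The sketch below is illustrative; the dictionary names are hypothetical, and the count for all eleven subtests simply applies the same arithmetic to the scale lengths named in the study.

```python
from math import comb  # available in Python 3.8 and later

# Number of ways to choose k subtests from the six verbal subtests.
verbal = {k: comb(6, k) for k in range(1, 7)}

# Number of ways to choose k subtests from the five performance subtests.
performance = {k: comb(5, k) for k in range(1, 6)}

# Abbreviated scales of two to five subtests drawn from all eleven.
full_scale = {k: comb(11, k) for k in range(2, 6)}

if __name__ == "__main__":
    print(verbal)
    print(performance)
    print(full_scale, sum(full_scale.values()))
```

The verbal and performance counts reproduce the table entries (6, 15, 20, 15, 6, 1 and 5, 10, 10, 5, 1), which is a quick way to confirm that a table of this kind has been read correctly.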


Summary and Conclusions 


The validities of all possible abbreviated 
WAIS scales of two, three, four, and five sub- 
tests were determined in this investigation. 
The coefficient of correlation between the full 
WAIS and the sum of the particular subtest 
scores was computed by a variation of
McNemar’s formula and this r was considered a
measure of the validity of the abbreviated 
scale. The reference group for this study was 
the 300 men and women in the 25-34 year 
age group used in the WAIS standardization. 


The Effect of Distrust on Some Aspects of Intelligence
Test Behavior


The Seton Psychiatric Institute


This experiment is concerned with the ef- 
fect of trustful compared to distrustful atti- 
tudes upon the Picture Completion (PC) and 
the Similarities (Sim) subtests of the Wechs- 
ler Adult Intelligence Scale (WAIS) (4). The 
hypotheses tested were: 

1. Those Ss who are distrustful respond to 
the instructions of the PC and Sim subtests 
with a task-inappropriate distrustful attitude.
They tend to think, “There is nothing miss- 
ing in that picture,” or “There is no simi- 
larity between those things.” Such inappro- 
priate responses impair performances on these 
tests and are reflected in comments made 
during the test. 

2. Instructions given to Ss designed to 
make them distrustful of the experimental 
situation have a similar impairing effect on 
PC and Sim performances because of the 
task-interfering attitudes aroused. Comments 
indicating distrust are increased by such
instructions.

3. No predictions are made concerning the 
interaction of Ss’ predisposition and the
experimental instructions. Those Ss who are
inclined to be distrustful may be made more so
if the situation conforms to their expectations.
Trustful Ss may be made distrustful.
On the other hand, it is possible they will not
perceive the experimental situation as
distrustful and will not react to it as such.


Procedure 


Four groups of 10 Ss each were used. Two
groups were chosen from Ss rated as highly
distrustful (HD groups). The remaining two
groups were rated low in distrust (LD
groups). One HD and one LD group were
given experimental instructions designed to
engender a distrustful attitude (IN groups).
One HD and one LD group were not given
distrust producing instructions (NIN groups).

1 The author is grateful to The Seton Psychiatric
Institute and to Doctor Leo H. Bartemeier for making
this research possible.

2 [...] Rosewood State Training School, Owings
Mills, Md.


The Ss were rated for distrust according to their 
responses to a 24-item inventory which was ad- 
ministered to 148 female student nurses. Eleven of 
the items were culled from the Minnesota Multi- 
phasic Personality Inventory (MMPI) scale for 
paranoia (1). These items were changed slightly in 
order to conform to the Ss’ background. The re- 
maining 13 filler items were from the L scale of the 
MMPI. The L score variable was controlled so that 
the four groups did not differ with respect to this 
variable. The items were worded so that they per- 
mitted a five-category response ranging from entirely 
trustful to entirely distrustful. Each critical item 
was then scored from 1 to 5 and the scores totaled. 
The highest score was taken to be indicative of the 
most distrustful attitude. The scores ranged from 14 
to 45. The mean score was 27.0 and the SD was 
4.6. LD Ss had scores of 21 or less; HD Ss had 
scores of 33 or higher. All Ss used were within the 
upper or lower 15% of the distribution. The inven- 
tory was administered independently of the experi- 
ment and two weeks prior to it. Two of the 40 Ss 
asked if a relationship existed between the experi- 
ment and the questionnaire. 
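
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the 11 critical items are assumed to be passed in already keyed so that 5 is the most distrustful response, and the cut-offs are the ones reported in the text (21 or less for LD, 33 or higher for HD).

```python
# Hypothetical sketch of the distrust inventory scoring described in the text.
# Only the 11 critical items are scored; the 13 filler (L scale) items are ignored here.

N_CRITICAL_ITEMS = 11
LD_MAX = 21  # "LD Ss had scores of 21 or less"
HD_MIN = 33  # "HD Ss had scores of 33 or higher"


def distrust_score(item_responses):
    """Total the critical items, each scored 1 (entirely trustful) to 5 (entirely distrustful)."""
    if len(item_responses) != N_CRITICAL_ITEMS:
        raise ValueError("expected responses to the 11 critical items")
    if any(r not in (1, 2, 3, 4, 5) for r in item_responses):
        raise ValueError("each item is scored from 1 to 5")
    return sum(item_responses)


def classify(score):
    """Return 'LD', 'HD', or None for a score between the two cut-offs."""
    if score <= LD_MAX:
        return "LD"
    if score >= HD_MIN:
        return "HD"
    return None


# Example: an S who answers 3 on every critical item scores 33 and falls in the HD range.
example_score = distrust_score([3] * N_CRITICAL_ITEMS)
example_group = classify(example_score)
```

With 11 items scored 1 to 5 the possible totals run from 11 to 55, which is consistent with the observed range of 14 to 45.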


The 20 NIN Ss were given the WAIS Vo- 
cabulary (V), PC, and Sim subtests in that 
order. At the start of the experiment they 
were told that the examiner was comparing 
student nurses with other groups. The 20 IN 
Ss were given instructions designed to induce 
a distrustful attitude. They were presented 
with the V subtest in the same way that the 
NIN groups were. However, at the conclusion 
of the V subtest, the examiner announced that 
he had lied and that he was conducting the 
experiment for other purposes about which
he would later inform the Ss. An impossible- 
to-solve block design problem was then pre- 
sented. After 90 seconds of coping with this 
problem, each S was told that she was once 
again deceived and that the problem could 
not be solved. The PC and Sim subtests were 
then administered. All tests were presented in 
accordance with the WAIS manual instruc- 
tions. Verbatim responses and spontaneous 
comments were recorded. 

The dependent variables consisted of the 
discrepancy of the scaled PC — V and Sim — 
V scores appropriate to each S’s age. Since 
the scaled scores are z scores and are based 
on each S’s age group, age and general level 
of intelligence (as is judged by the V test) 
were controlled. If a 20-year-old S had raw 
scores of 55, 13, and 16 on the V, PC, and 
Sim subtests, her weighted scores on each 
of these subtests were 12, 9, and 11 respec- 
tively. Her PC — V discrepancy score would 
have been — 3 and the Sim — V discrepancy 
score would have been — 1. The V scale was 
chosen as a base for measuring impairment, 
because it is highly correlated with the total 
WAIS score. A priori the V subtest would not 
seem to be sensitive to a distrustful attitude. 
PC and Sim subtests were chosen as depend- 
ent variables, because they were thought to 
be more sensitive to distrustful attitudes. The 
instructions used in administering these tests 
would allow for disbelief. 
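
The discrepancy measure can be written out as a small Python sketch. The function name is hypothetical; the input values are the weighted scores from the worked example in the text (V = 12, PC = 9, Sim = 11). The combined score is the sum of the two discrepancies, as in the last row of Table 1.

```python
# Hypothetical helper illustrating the impairment (discrepancy) scores described above.

def discrepancy_scores(v_scaled, pc_scaled, sim_scaled):
    """Return PC - V, Sim - V, and their sum, using age-appropriate scaled scores."""
    pc_minus_v = pc_scaled - v_scaled
    sim_minus_v = sim_scaled - v_scaled
    return {
        "PC-V": pc_minus_v,
        "Sim-V": sim_minus_v,
        "(PC-V)+(Sim-V)": pc_minus_v + sim_minus_v,
    }


# Worked example from the text: scaled scores of 12, 9, and 11 on V, PC, and Sim.
example = discrepancy_scores(12, 9, 11)
```

Negative values indicate that the subtest falls below the level expected from the Vocabulary score.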

The responses and spontaneous comments
were also examined for expressions of disbelief.
For the PC, these included expressions
such as “Nothing is missing from this picture,”
“Is there always something missing?,”
and “Nothing that I can see . . . ,” etc.
Expressions of disbelief on the Sim included:
“They are not alike,” “. . . Opposite . . . ,”
“Praise and punishment . . . alike did you
say?,” etc. The number of disbelief statements
for each S for the PC and Sim subtests
was tabulated.


Table 1

Mean Impairment Scores of Groups

                             Group
                      LD      LD      HD      HD
Tests compared        NIN     IN      NIN     IN
PC-V                 -2.2    -1.5    -3.2    -3.1
Sim-V                 +.2     -.5    -1.2    -1.3
(PC-V) + (Sim-V)     -2.0    -2.0    -4.4    -4.4


Table 2

Mean Number of Comments Indicating Distrust

                             Group
                      LD      LD      HD      HD
Test                  NIN     IN      NIN     IN
PC                     .5     1.8     1.1     2.1
Sim                    .4      .9     1.2      .9
PC + Sim               .9     2.7     2.3     3.0


Results 


Data concerning impairment as a function
of Ss’ distrustful attitudes and experimental
conditions appear in Table 1. HD Ss com-
pared to LD Ss are significantly impaired on
the PC (F = 5.04, 1 and 36 df, p = .02).³
Impairment on the Sim was also significantly
greater for HD than LD Ss (F = 4.96, 1 and
36 df, p = .02). When impairment on both
the PC and Sim is combined, it is seen that
HD Ss compared to LD Ss are more signifi-
cantly impaired (F = 9.72, 1 and 36 df, p
< .01). Instructions designed to create a dis-
trustful attitude apparently do not produce a
significant impairment for either HD or LD
groups or for both combined. Nor do instruc-
tions interact significantly with Ss’ predispo-
sitions to affect performance.

Table 2 contains data relevant to comments
indicating distrust. When the PC alone
is considered, HD Ss compared to LD Ss do
not tend to make more remarks indicative of
distrust. The IN groups make more comments
indicative of a distrustful attitude
(F = 7.57, 1 and 36 df, p < .01). The interaction
between experimental conditions and
predispositions was not significant. Considering
the Sim, the only finding attaining a significant
level was that HD Ss compared to
LD Ss produce more spontaneous distrustful
comments (F = 3.20, 1 and 36 df, p = [illegible]).
Combining the PC and Sim comments, it is
found that HD Ss verbalize more distrust
than LD Ss (F = 4.41, 1 and 36 df, p = [illegible]).
The IN Ss compared to NIN Ss also make
more comments indicating distrust
(F = [illegible], 1 and 36 df, p < .01).
The interaction between instructions and
predispositions was not significant with the PC
and Sim combined.

In summary, evidence was provided to
show that comments associated with distrust
tend to be a function of Ss’ predispositions
as well as experimentally induced attitudes.
Intellectual impairment is more significantly
shown by Ss who tend to be distrustful.
Experimental conditions are not associated
with intellectual impairment, although they
lead to comments which indicate distrust.

It was hypothesized that a distrustful attitude
stimulates an interfering response because
it prevents the task-appropriate response
from being made. People who say,
“There is nothing missing in that picture!”
are responding to internal needs rather than
to the [...] situation. If this is the case,
there should be a positive correlation between
interfering verbal responses and impaired
scores. Of 40 Ss tested, nine Ss made
[illegible] remarks classified as distrustful
on both the PC and Sim. Their combined PC
and Sim impairment was 4.4 units. Sixteen Ss
produced none or only one interfering comment
on the PC and Sim. These Ss were impaired
by 2.7 units. The mean difference
between these Ss was 1.75 (t = 1.62, 24 df,
p = .06). Thus there is some indication that
interference responses, as indicated by the
number of verbal comments, are followed
by an impaired intellectual performance.

3 F values using 1 df were converted into t values.
All probability estimates derived from such values
are based on one-tailed tests of significance.
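
The conversion mentioned in the footnote on F values (an F with 1 numerator df is the square of t, and the probability is then read from one tail) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the tail area is obtained by a plain Simpson's-rule integration of the Student t density rather than from printed tables, which the author would have used.

```python
import math


def t_density(x, df):
    """Student t probability density with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def one_tailed_p_from_f(f_value, df_error, steps=2000):
    """Convert F(1, df_error) to t = sqrt(F) and return (t, one-tailed p).

    The area between 0 and t is integrated with Simpson's rule (steps must be even)
    and subtracted from .5, the area of one half of the symmetric t distribution.
    """
    t = math.sqrt(f_value)
    h = t / steps
    total = t_density(0.0, df_error) + t_density(t, df_error)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df_error)
    area_zero_to_t = total * h / 3
    return t, 0.5 - area_zero_to_t


# F = 5.04 with 1 and 36 df is the PC comparison of HD and LD Ss reported above.
t_value, p_one_tailed = one_tailed_p_from_f(5.04, 36)
```

For this comparison t is about 2.24, and the one-tailed probability falls between .01 and .025, in line with the reported p = .02.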


Discussion

The results indicate that Ss’ distrustful
predispositions are correlated with impaired
performances on the WAIS PC and Sim subtests.
Distrustful Ss will verbalize their attitudes
by not accepting the test instructions as
true. Suggestive evidence was obtained
showing that a distrustful attitude stimulates
responses which interfere with task-appropriate
behavior. In turn, intellectual impairment
is, in part, a function of such task-inappropriate
responses.

Experimental instructions designed to produce
a distrustful attitude were not effective
in inducing impaired performances, but were
associated with spontaneous comments indicative
of distrust. This datum indicates that
the test instructions were in some measure
effective in inducing the desired attitude. It
is unclear why experimental instructions produce
task-interfering responses but do not
impair intellectual performance. There are
gradations in the strength of a distrustful
attitude. Verbal comments might emerge
rapidly as a result of instructions. However,
Ss may retain their ability to recover and assert
a task-appropriate attitude. Possibly the
instructions produced some distrust, but not
enough to hinder intellectual functioning.

The experiment generally indicates that
character traits may affect WAIS performance.
It is possible that paranoid conditions
which generally are highly associated with a
distrustful attitude would be revealed by lowered
WAIS PC and Sim scores and that such
scatter would constitute a diagnostic sign.
Schofield’s review of the literature (3) does
not mention such findings. Numerous personality
characteristics make up a paranoid state.
Certain traits may tend to enhance
a given intellectual skill. For example,
the perceptual alertness of [...]
make him very sensitive to [...]
(2, p. 80) so as to neutralize the effect of his
distrustful attitude.


Summary

It was hypothesized that distrustful attitudes
are reflected in intellectual behavior as
measured by impaired Wechsler Adult Intelligence
Scale Picture Completion and Similarities
subtest scores. A distrustful attitude is
hypothesized to be a stimulus for an interfering
response which prevents task-appropriate
responses from being made.

Four groups of 10 Ss each were tested. Two
groups were rated highly distrustful and
two were rated low in distrust. Distrustful attitudes
were measured by [...]. One
highly distrustful and one low distrustful group
were given instructions designed to induce
[...] toward the experimental
[...] groups were given
differentiate among psychoneurotic, organic,
and schizophrenic groups” (1, p. 375).
[...]tion that the scale is lacking
in diagnostic efficacy, however, does not necessarily
indicate that it is lacking in reliability
nor does it negate its usefulness for
purposes other than diagnosis. Diagnosis is
hardly the sole raison d’être for psychological
tests, and it is possible that memory tests
could make valuable contributions to two
broad areas of activity: guidance and research.
For example, memory tests might be
[...] in helping the individual who
[...] reach decisions
[...] his future mode of life. They also have
[...] investiga[...]
[...] the investigation of Morrow
[...] of the effects of gross brain
[...] on memory test performance). The
[...] and research activities
[...] is the WMS measuring something
not measured in the intelligence scale?
[...] is acute. [...] study
rep[...] [...]cts of WMS:
[...] with the Wechs[...]
Procedure

The WMS and the W-B, Forms I and II,
are described in detail in their respective
manuals (10, 11, 12). Although there are two
forms of WMS, only Form I was used in this
study.


Table 1

Range, Mean, and Standard Deviation of Age, Education,
Wechsler-Bellevue IQ, and Wechsler
Memory Scale Scores of 150 Males

Variable                          Range     Mean      SD
Age                               25-34     29.11     2.77
Education                          6-18     10.81     2.56
Wechsler-Bellevue
  Full Scale IQ                  78-136    107.87    12.18
  Full Scale Score               57-148    105.49    17.76
  Verbal Scale Score              26-79     52.41    10.90
  Performance Scale Score         30-73     53.08     9.45
Wechsler Memory Scale
  Information and Orientation     10-11     10.75      .62
  Mental Control                    2-9      6.92     2.01
  Logical Memory                   2-20     10.01     3.85
  Digit Span                       7-15     11.65     2.07
  Visual Reproduction              1-14      9.74     2.99
  Associate Learning               7-21     15.93     3.27
  Total of Subtest Scores*        37-87     65.01    10.13
  Memory Quotient                63-143    104.95    16.98

* The total before addition of the age correction points.

The subjects of the study were individuals 
tested in the Clinical Psychology Section, 
Neuropsychiatric Service, Bronx VA Hospital, 
during the period from 1946 through 1949. 
During this period the WMS and the W-B 
were given routinely by some members of the 
staff and by some of the clinical psychology 
trainees. The diagnoses of the subjects are 
varied (e.g., diabetes, phantom limb pain, 
hepatitis, etc.), but the majority, 55%, had 
behavior disorder diagnoses (neurosis, imma- 
ture personality, etc.). 

The medical records of all patients who re- 
ceived both the WMS and the W-B and were 
in the age group from 25 through 34 were ex- 
amined, and all individuals who fulfilled the 
research criteria were selected. The criteria 
were: (a) not psychotic, (b) no symptoms 
attributed to brain damage, (c) no history 
of head trauma, (d) electroencephalogram, if 
done, reported as normal, (e) no history of 
shock therapy, (f) if Negro, not educated in 
a southern state, and (g) not educated in a 
foreign country. A total of 150 individuals 
(all males) passed the research criteria. Of 
the total group, 86 were given W-B, Form I,
and 64 were given W-B, Form II. In the cor-
relations with WMS, the scores from W-B I
and II are pooled.


Results 


Table 1 shows the range, mean, and standard
deviation of the age and education of the
group as well as their WMS and W-B scores.
Chi-square tests of the shape of the Full
Scale IQ and the Memory Quotient (MQ)
distributions indicate that both approximate
a normal curve (χ² of 2.3170 with 7 df, .95
> p > .90; χ² of 7.9987 with 7 df, .50 > p
> .30).

The internal consistency of five of the
WMS subtests and of the total scale was
tested by the method (coefficient alpha) described
by Cronbach (2); the reliability coefficients
are shown in Table 2. Since the
variability of the Information and Orientation
subtests is infinitesimal, analysis of these
two subtests was not done.

In order to secure information about the
relationship between specific subtests, the
intercorrelations of five of the WMS subtests
with each other and with total score were
computed. Since the reliability coefficient
(.368) of the Associate Learning Subtest indicates
that the two items of the subtest are
not at all equivalent and since logical analysis
suggests that it is the Hard Words portion
in which learning activity is most [illegible],
the Hard Words score was correlated
with the four other subtests and with the total
score.

To secure information about the degree of
relationship between the WMS and the W-B,
the total WMS score (before the addition of
the age correction score) was correlated with
the W-B Verbal, Performance, and Full Scale
scores. Digit Span is a subtest in both the
W-B and WMS, and the score of this subtest
was, therefore, eliminated from the W-B scores.
Since in all but nine of the 150 cases
all six of the W-B Verbal subtests were given,
the elimination of Digit Span had little effect
on the prorated Verbal scores (with Digit
Span excluded, mean and standard deviation
of scores are 53.69 and 10.97, respectively).
The correlation coefficients are shown
in Table 3.

Wechsler intended the MQ to be directly
comparable to W-B Full Scale IQ
(11, p. [illegible]). In clinical practice, an
MQ lower than the Full Scale IQ is frequently
cited as evidence of brain damage. This practice
occurs, however, without any knowledge about
either the correlation between IQ and MQ or
the size of difference that can be considered
significant. The correlation coefficient of IQ
and MQ in this study is .767. The standard
deviation of difference scores (IQ minus MQ)
is 10.687; the range is from +40 to -21.
In an effort to determine whether the mean
difference between IQ and MQ varies with IQ
[...], an analysis of variance was done. The
[...], however, is not significant
(F of 1.04 with 3 and 146 df).


Table 2

Reliability of the Wechsler Memory Scale

                                              Coefficient
Subtest and subtest items         Variance       alpha
Information and Orientation           .389
Mental Control                       4.020        .383
  Counting 20-1                       .487
  Alphabet                           1.154
  Counting by 3’s                    1.353
Logical Memory                      14.816        .814
  Paragraph 1/2                      4.973
  Paragraph 2/2                      3.812
Digit Span                           4.280        .641
  Digits Forward                     1.122
  Digits Backward                    1.773
Visual Reproduction                  8.925        .634
  Design A                            .552
  Design B                           1.950
  Design C-1                         1.253
  Design C-2                          .921
Associate Learning                  10.692        .368
  Easy Words/2                        .453
  Hard Words                         8.274
Total Score                        102.698        .6[illegible]


Table 3

Intercorrelations of Wechsler Memory Scale Subtests and of the Memory Scale with the
Wechsler-Bellevue Intelligence Scale

                                     Correlation coefficient
                       Logical            Visual    Assoc.    Hard     Total
Variable               memory   Digits    repro.    learn.    words    score*
Wechsler Memory Scale
  Mental Control        .369     .474      .325      .268      .236    [illegible]
  Logical Memory                 .306      .359   [illegible]  .405     .324
  Digits                                   .394      .342      .321     .527
  Visual Reproduction                                .298      .284     .458
  Associate Learning                                                    .473
  Hard Words                                                            .482
Wechsler-Bellevue
  Full Scale Score                                                      .750
  Verbal Score†                                                         .690
  Performance Score                                                     .536

* With the relevant subtest eliminated.
† With the Digit Span [...]
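
As a check on how the Table 2 reliabilities relate to the tabled variances, coefficient alpha can be computed directly from the item variances and the variance of their sum. The sketch below uses the standard form of Cronbach's formula; the function name is hypothetical, and the variances are read from Table 2.

```python
# Coefficient alpha from item variances and the variance of the composite score:
#   alpha = k / (k - 1) * (1 - sum(item variances) / total variance)

def coefficient_alpha(item_variances, total_variance):
    """Cronbach's coefficient alpha for a composite of k items."""
    k = len(item_variances)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Variances as printed in Table 2.
mental_control = coefficient_alpha([0.487, 1.154, 1.353], 4.020)
logical_memory = coefficient_alpha([4.973, 3.812], 14.816)
associate_learning = coefficient_alpha([0.453, 8.274], 10.692)
```

Rounded to three places these come to .383, .814, and .368, the values tabled for Mental Control, Logical Memory, and Associate Learning.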
Discussion

Use of WMS in Diagnosis

Although the diagnostic usefulness of the
WMS was not the focus of this study, two of
the findings are pertinent to the validity of
some of the assumptions underlying present
diagnostic practice:

1. The low intercorrelations among WMS
subtests indicate that score differences are so
[...] and so large that they have no practical
predictive value. In the voluminous literature
on learning tasks (and most of the
WMS subtests are simple learning tasks) low
intercorrelations are consistently reported (3,
[...]). The findings in the present study, therefore,
cannot be explained as nothing more
than the natural consequence of a preponderantly
neurotic sample. The assumption that,
in the absence of demonstrable brain pathology,
individuals will perform at approximately
the same level on all the WMS subtests
is not tenable.

2. In 24% of the cases in the study re- 
ported here, IQ exceeded MQ by 24 and more 
points. Such large deviations occurring in a 
group so carefully screened for brain damage 
indicate that the degree of comparability be- 
tween the two quotients is not great enough 
for use in individual diagnosis. 


Reliability 


The reliability coefficients indicate that the 
subtests and, in several instances, the subtest 
items have a low degree of measurement 
equivalence. Two of the subtests, Mental Con- 
trol and Associate Learning, have so little in- 
ternal consistency that it is difficult to justify 
the segregation of their items into subtests.
Observation of patients suggests that most of 
the variance which occurs in Mental Control 
is error variance—variance contributed by 
such a startle reaction as: “Say my ABC’s! 
The doctor must really think I am crazy!” 

The degree of internal consistency that is
optimum for a test is, of course, an empirical 
question and cannot be determined without 
specification of what the test is intended to 
predict. Unfortunately, memory tests have 
been constructed in the absence of any ap-
preciable effort to describe and analyze the
aspects of human behavior which the term 
memory function designates. (For further dis- 
cussion of the definition of memory, see Ing- 
ham [7].) Since the WMS is seemingly based 
on an ambiguous “common-sense” definition 
of memory, it would be difficult to ascertain 
just what degree of reliability would be opti- 
mum. When one takes into account, however, 
the relatively low reliability and extreme brev- 
ity in conjunction with the range of function 
level which the test attempts to encompass, 
it seems highly unlikely that the WMS is ca- 
pable of yielding data which are sufficiently 
accurate to be useful in predicting behavior. 

The level of subtest difficulty is so dis- 
parate that it is questionable whether the en- 
tire test should be given to many individuals. 
Observation of the reaction of patients sug-
gests that several of the subtests (viz., Infor-
mation, Orientation, Mental Control, and the
Easy Words item of Associate Learning) are
so simple that they are appropriate only in
those instances in which there is gross disor-
ganization of intellectual processes. Individu-
als who, in a hospital setting, are asked to
complete items which “any child would know”
are apt to be either frightened to death or in-
sulted by the implications of the request.
Neither reaction is conducive to securing data
which are representative of the individual's
behavior under more usual circumstances.
The analysis of the WMS and of the types of
problems with which clinical psychology is
concerned suggests that improvement in the
predictive accuracy of psychological tests ne-
cessitates the abandonment of tests which are
measures-of-everything and the construction
of a variety of univocal tests, each test de-
signed to secure samples of very limited and
carefully specified aspects of human functioning.


Correlation of WMS with W-B


Wechsler says that, in the standardization
group of “about 100” with an age range from
25 to 50, WMS age group means “parallel
very closely that of the Performance part of
the Bellevue” (11, p. 88). From the context
in which the statement occurs, one infers that
the memory score showed an age [...]
similar to that of W-B Performance. In the
relatively young age group studied here, however,
the relationship between WMS and W-B
Verbal is greater than that with W-B Performance
(see Table 3). It is, of course, possible
that in older age groups the relationship
is reversed.

The overlap in measurement between WMS
and W-B is fairly large (r of Memory
and W-B Full Scale score is .750). In view of
the scale’s brevity, relatively low internal consistency,
and many sources of error variance,
one must ask: Once the communality with
W-B is removed, is the remaining variance
of WMS anything other than error variance?
Unless it can be demonstrated empirically
that WMS contributes unique and true variance
to studies of human behavior, one cannot
justify giving both tests.
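
The question raised here can be made concrete with the usual classical-test-theory bookkeeping, in which the variance of a standardized score is split into a part shared with the other test (r squared), a reliable part not shared with it, and error. The sketch below is an illustration only: the function name is hypothetical, and the reliability of .70 is an assumed figure chosen for the example, not a value reported in the study. Only the correlation of .750 comes from the text.

```python
# Illustrative partition of standardized WMS variance (total = 1.0).
# ASSUMPTION: reliability = 0.70 is a hypothetical value used only for this example.

def partition_variance(r_with_other_test, reliability):
    """Return (shared, unique_true, error) proportions of variance.

    shared      = r squared, the variance in common with the other test
    error       = 1 - reliability
    unique_true = reliability - r squared, reliable variance not shared with the other test
    """
    shared = r_with_other_test ** 2
    error = 1.0 - reliability
    unique_true = reliability - shared
    return shared, unique_true, error


shared, unique_true, error = partition_variance(0.750, 0.70)
```

With r = .750 about 56% of the variance is shared with the W-B. Under the assumed reliability, roughly 14% would be reliable variance specific to the WMS and 30% error, so the answer to the question turns almost entirely on how reliable the scale actually is.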
Subtest Disparity of Negro and White Groups
Matched for IQs on the Revised Beta Test¹


Walter A. Woods 
Nowland & Company 


and Robert Toal
Medical College of Virginia 


The report of a study by Woods, Boger, 
and Holman (3) revealed that subtest scaled 
scores of the Revised Beta Examination con- 
tributed unequally to total test scores of 
Negro adolescents. Nondelinquents attained 
higher scores than did delinquents, but in 
both groups it was apparent that some sub- 
tests contributed greater weight than others. 
No significant differences were found between 
sexes. 

Two alternatives appeared to provide pos- 
sible explanations for the subtest disparity. 
Since the sample was substantially below the 
norm group mean, although above the defec- 
tive level, it could be that the subtests con- 
tributed disproportionately among groups at 
the lower IQ levels. Yet a second possibility 
was that some influence was operative in Ne- 
gro groups, and absent in white (norm) 
groups, which brought forth differential re- 
sponses to the subtest items. 

The present study attempts to throw light 
upon these two possible alternatives. 


Experimental Design 


One hundred and twenty Beta scores of 
adolescent male and female Negro subjects, 
drawn from industrial school and public 
school populations, were matched with 120 
Beta scores of white adolescents, also drawn 
from industrial school and public school popu- 
lations. Matching was made on sex and on 
total score. Since the previous study had
demonstrated that sex differences were not
significant in the populations at issue, both male
and female subjects were used. All subjects
were between the ages of 14 and 17, inclu-
sive. No attempt was made to match for age,
since for the purposes of our experiment age
matching was unnecessary.

1 From research conducted while authors were af-
filiated with Richmond Professional Institute.

Scores were classified according to six levels
of performance, from high to low, for Negro
and white groups to facilitate a treatments-by-levels
analysis of variance. Twenty
Negro and 20 white scores were assigned to
each level. The subtest scores were treated
as treatments: an assumption of
equality is intrinsic in the scaling, thus permitting
the assumption of equality of subtest
means in the population.

Negro and white performance on each subtest
was compared by computing differences
between group means on each subtest and
ascertaining t ratios.

Results


Table 1 summarizes the mean scores for
the subtests, by levels, for Negro and white
groups. Total means are equal for Negroes
and whites at each level, since the groups
were matched on total score. The subtest
means at each level support the findings of
the former study, indicating disproportionate
contributions of the subtests at different levels.
When we compare the column means by
t ratio, we find that the subtest contributions
are not the same for Negroes as for whites.
Whites perform better on Subtests 3 (detection
of errors), 4 (paper form board), and
5 ([...]). All of these [...]
exceed [...]. Negroes perform [...]
Subtests [...] (visual comp[...]).
These differences also exceed
[...] at the 95% level.


Table 1

Mean Scaled Scores of Negro and White Adolescent Groups Classified by Six Levels on
Total Scaled Scores on the Revised Beta Examination

                  Total                        Subtest mean
Level   Race      mean         1       2       3       4       5       6
1       Negro      7.1        5.1     8.4     8.4     4.5     7.2     9.0
        White      7.1        5.8     7.9     8.6     4.8     7.7     8.1
2       Negro      8.4        7.7    10.7     8.6     4.9     8.2    10.2
        White      8.4        6.9     9.6    10.0     6.3     8.9     8.8
3       Negro      9.2        8.7    11.2     9.6     5.6     9.6    10.6
        White      9.2        8.2    10.4     9.5     7.4     9.8    10.1
4       Negro   [illegible]   8.7    13.0    10.0     6.2     9.8    11.4
        White   [illegible]  [illegible]                     10.6    10.4
5       Negro     10.4       [illegible]
        White     10.4       [illegible]
6       Negro     11.5       [illegible]              9.2    11.3    12.7
        White     11.5       [illegible]
Mean of means                [illegible]
t                            [illegible]

* Significant at .05 level.


By analysis of variance it is found that
first-order interaction effects of subtests with
levels, of subtests with race, and [...]
are significant [...] degree. This is shown in Table 2.
Subtest effects and levels effects are significant.
Interaction of race and levels is
nonsignificant. The levels effect is, of course,
intrinsic to the design. The significant subtests
effect supports the hypothesis that
subtest contribution is unequal.


Table 2

Analysis of Variance of 120 White and 120 Negro
Revised Beta Scores Classified by Subtest,
by Level, and by Race

Source                             df            MS         F
Subtests                            5          501.34     199.36*
Levels                              5          581.83     304.19*
Subtests by Levels             [illegible]      46.56      17.8*
Subtests by Race               [illegible]      61.09      23.4*
Levels by Race                      5             .00        .00
Levels by Subtests by Race     [illegible]       9.58       3.67*
Individual Differences           1372            2.61

* Significant at .05 level.


An examination of the mean scores by
level of each subtest in the white groups indicates
that, although the contributions are
unequal at each level, they increase proportionately
with increase in total score and tend
to become equal at the higher levels.
[...], the increase in Negro scores is
not consistent [...] several subtests, since
some scores increase at a greater rate than
others. [...] and with subtest
provides some explanation for this different
pattern among the white as contrasted to the
Negro group. Race with levels interaction is 
not significant, while race with subtest is. 
Thus, the second-order interaction is explain- 
able only in terms of the disproportionate in- 
crease in the Negro group. This dispropor- 
tionality appears to be produced by several 
of the subtests—more markedly by Subtests 
3 and 6, the means of which increase with 
smaller increment than do the means of the 
other subtests; and Subtests 1 and 4, which 
increase with greater increment.

These findings indicate that scaled scores 
on the Beta Examination are disproportion- 
ate due to level—some of the subtests con- 
tributing more, others less, at lower levels of 
ability—and due to racial (and this may be 
cultural) differences. This latter hypothesis 
is supported both by the disproportionality 
at different levels, particularly at lower lev- 
els, and by the second-order interaction. 

The tests on which Negroes tend to per- 
form better are essentially tests which re- 
quire perceptual speed and accuracy. It has 
been reported in the literature on racial in- 
telligence differences (1, p. 491) that Ne- 
groes in our society seem to have little in- 
centive to do things rapidly. In our groups 
matched on total score, our Negro sample 
was not handicapped by this supposed dis- 
inclination to work rapidly. 

Two of the subtests on which Negroes are 
inferior to whites (3 and 5) are culturally 
“loaded”; that is, they contain items which 
appear to be common in our culture, but are 
not equally common to all cultural segments. 
The third, Subtest 4 (paper form board), 
seems related to judgments involving spatial 
visualization. In conceptualizing the visual
material presented in this subtest, and men-
tally manipulating it, a higher level cognitive
process seems to be required, which is not
needed for the successful performance on any
of the other subtests. Thus, it appears that
Negroes, when compared with whites of
“equal ability,” are most deficient in cul-
turally loaded items and in items which re-
quire ability to visualize spatially. They are
superior to whites in items requiring per-
ceptual speed and accuracy.

We also find greater subtest disparity at
lower levels for both Negroes and whites.
Differences exist at lower levels which are
not so much in evidence at higher levels.
This was particularly apparent in the maze
and paper form board tests, which are be-
lieved to involve, respectively, ability to plan
ahead and ability to mentally manipulate
and to visualize spatial arrangements.

It is suggested that other studies in which
Negroes and whites are matched on particu-
lar variables may prove revealing.
Revised Administration and Scoring of the
Digit Span Test¹


Harold L. Blackburn and Arthur L. Benton 
It is well known that, under standard conditions of administration
(e.g., in the WAIS), the auditory-vocal digit span test has a
relatively low reliability. For example, Derner, Aborn, and Canter (1),
studying the test-retest reliability of the Wechsler-Bellevue subtests
in a sample of 158 normal Ss, found the correlation coefficient between
test and retest to be .67, this value ranking ninth among the 11
subtests of the scale. Moreover, in summarizing the findings of seven
studies dealing with the performances of neuropsychiatric patients,
these authors report a median test-retest correlation coefficient of
.65 for the digit span, an estimate which is hardly suggestive of
satisfactory reliability. Yet the test plays a role in clinical
diagnosis, particularly as an “anxiety indicator” and as a basis for
inferences about the presence of cerebral injury or disease.

It seems reasonable to suppose that the validity of the digit span for
these and other purposes would be enhanced if a procedure that yielded
a more reliable estimate of the abilities involved in this performance
were utilized. Augmentation of the test-retest reliability of a task
such as this can be rather easily accomplished by increasing the number
of trials, adopting a systematic order of presentation and determining
a relatively stable threshold value. However, while this psychophysical
procedure is appropriate for experimental work, it is much too
time-consuming for routine clinical use. On the other hand, it is
possible that some relatively slight modifications in procedure and
scoring, which would not necessarily involve an important increase in
administration time, might provide estimates that are significantly
more reliable than those secured with the standard administration and
scoring.


1 This investigation was supported by a research grant (5-616) from the
National Institute of Neurological Diseases and Blindness, of the
National Institutes of Health, Public Health Service. The writers are
greatly indebted to Dr. Leonard S. Feldt for valuable suggestions and
criticisms and to Mr. Richard C. Jentsch for assistance in the
collection of the data.

This possibility was explored in the present 
study which investigated the effects of two 
minor procedural modifications and a revision 
in scoring on the test-retest reliability of the 
task. The procedural modifications consisted of: (a) having S repeat or
reverse both sets of digits of a given series length even when he had
correctly repeated or reversed the first set of the pair; and (b)
terminating the repetition or reversal of digits after three successive
failures rather than two. The revision in scoring consisted in giving
credit for each set of digits correctly repeated or reversed rather
than by the conventional “highest score” method.²

Procedure 

The digit span performances of three main groups of Ss were
investigated: (a) 100 college students; (b) 105 nonpsychotic patients
on various services of the University Hospital and Veterans
Administration Hospital, Iowa City; (c) 61 patients with confirmed or
suspected cerebral disease who were on the neurological or
neurosurgical services of these hospitals.


2 Justification for this revision in scoring is provided in the
intensive study of digit span performance by Peatman and Locke (3), who
found that it yielded consistently higher test-retest reliabilities
than did the “highest score” method.

Approximately half the Ss in each group received
the WAIS administration of the digit
span. Following interpolated tasks of 20-30 
minutes duration, they were retested with the 
same administration. The other half of each 
group received the revised administration 
(hereafter referred to as the BB administra- 
tion) of the digit span under the same con- 
ditions, i.e., retest after an interval of 20-30 
minutes. Each type of administration could 
be further subdivided into two types, accord- 
ing to whether two or three successive fail- 
ures were taken as the criterion for terminat- 
ing the task of repeating or reversing the 
digits. Thus, four administrative procedures 
were available for comparison with respect to 
the test-retest reliability of the performance 
estimates, viz.: 

WAIS II: Standard WAIS administration.

WAIS III: Standard WAIS administration except that the task is
terminated after three successive failures.

BB II: Requiring S to repeat or reverse both sets of digits of a given
series length and termination of the task after two successive
failures.

BB III: BB II procedure, except that the task is terminated after three
successive failures.

Scoring. Performances under the WAIS ad- 
ministrations were scored in the usual man- 
ner, i.e., the “highest score” method. Perform- 
ances under the BB administrations were 
scored in terms of the number of sets of digits 
correctly repeated or reversed, one point be- 
ing given for each correctly reproduced set, 
always beginning with three digits forward
and two digits backward. No basal score was 
added to this simple frequency count. 
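
The difference between the two scoring rules can be illustrated with a short sketch. The Python below is illustrative only and is not part of the published procedure: the function names, the representation of a performance as a mapping from series length to the pass/fail outcome of each of its two sets, and the exact handling of the discontinuation rule are assumptions made for this example.

```python
def administer(results, start, max_failures, both_sets):
    """Return the (length, passed) trials that would actually be given.

    results      -- hypothetical mapping: series length -> (set 1 ok, set 2 ok)
    start        -- shortest series given (3 forward, 2 backward in the text)
    max_failures -- successive failures that terminate the task (2 or 3)
    both_sets    -- True for BB (both sets always given),
                    False for WAIS (set 2 only after a failure on set 1)
    """
    given = []
    failures = 0
    for length in sorted(n for n in results if n >= start):
        for passed in results[length]:
            given.append((length, passed))
            failures = 0 if passed else failures + 1
            if failures >= max_failures:
                return given
            if passed and not both_sets:
                break  # WAIS: skip the second set after a success
    return given


def highest_score(given):
    """'Highest score' method: length of the longest series passed."""
    return max((length for length, ok in given if ok), default=0)


def frequency_score(given):
    """BB method: one point per correctly reproduced set, no basal score."""
    return sum(1 for _, ok in given if ok)


# A made-up digits-forward performance.
forward = {3: (True, True), 4: (True, True), 5: (True, False),
           6: (False, True), 7: (False, False), 8: (False, False)}

wais_ii = highest_score(administer(forward, 3, 2, both_sets=False))
bb_ii = frequency_score(administer(forward, 3, 2, both_sets=True))
bb_iii = frequency_score(administer(forward, 3, 3, both_sets=True))
print(wais_ii, bb_ii, bb_iii)
```

The total score under either rule would be the sum of the forward and backward components, each computed in this way. Because the BB count draws on more scored trials than the single "longest series" value, it has more possible score points for the same performance.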


Results 


The mean ages and educational levels of the Ss in the several subgroups
are presented in Table 1. It is evident from inspection that, within
each group, the two subgroups who received the different
administrations are fairly comparable with respect to these
characteristics. The three largest differences (ages in the
nonpsychiatric patient and brain-injured patient groups; educational
levels in the brain-injured group) were analyzed by means of the t test
and found to be nonsignificant.


Table 1

Age, Educational Background and Level of Test Performance
in the Several Groups

                                    Age    Education    Initial Test
Group                 N           (years)   (years)   II Score  III Score
College WAIS          50   Mean    20.9      13.2       11.7      12.0
                           SD       3.7       1.1        2.1       ?
College BB            50   Mean    19.8      13.1       15.7      16.1
                           SD       1.8       1.0        3.7       ?
Patients WAIS         56   Mean    39.9       9.6       10.2      10.4
                           SD      11.4       2.2        2.2       2.2
Patients BB           49   Mean    36.6       9.8       13.1      13.3
                           SD      10.9       2.5        3.2       3.1
Brain-Injured WAIS    30   Mean    41.6      10.5        9.1       9.5
                           SD      13.0       3.9        2.1       ?
Brain-Injured BB      31   Mean    37.5       9.7       10.5      10.5
                           SD      11.9       2.3        3.0       ?
Total WAIS           136   Mean    33.3      11.4       10.5      10.8
                           SD      13.6       2.9        2.4       ?
Total BB             130   Mean    30.3      11.0       13.5      13.7
                           SD      12.2       2.6        3.9       4.0


Table 2

Test-Retest Reliability Coefficients Under the Four
Administrative Procedures

                                 Procedure
                          WAIS    WAIS     BB      BB
Group                      II     III      II      III
College                   .6?     .68     .79     .8?
Nonpsychiatric Patient    .71     .61     .81     .82
Brain-Injured              ?       ?      .80     .79
Combined                  .70     .6?     .80     .81


Table 2 shows the test-retest reliability co- 
efficients for each group under each type of 
administration. It will be noted that the re- 
liability coefficients resulting from the BB 
administrations are consistently higher than 
those yielded by the WAIS administrations. 
It is also evident that employing the criterion
of three successive failures for terminating the
task does not improve test-retest reliability. 

The homogeneity of the sets of three test-retest reliability estimates
for each administration was assessed by a chi-square test described by
Rider (4). Since all four tests yielded nonsignificant chi-square
values, each set of estimates was combined by the method of average zs
and a single reliability coefficient for each administration obtained.
These total reliability estimates are also shown in Table 2. Assessment
of the significance of the differences in the size of these total
reliability coefficients, based on a comparison of the differences in
zs with the standard error of their difference, showed that the BB II
administration was more reliable than the WAIS II administration to a
questionably significant degree (.10 > p > .05), that the BB III
administration was significantly more reliable than the WAIS II
administration (.05 > p > .02), and that both BB administrations were
significantly more reliable than the WAIS III administration (p < .01).

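
The combination and comparison of reliability coefficients through Fisher's z transformation can be sketched as follows. This is a minimal illustration rather than a reconstruction of the authors' computations: weighting each z by n - 3, the two-sided normal p value, and the function names are assumptions of the sketch. The subgroup sizes are taken from Table 1 and the coefficients from the legible entries of Table 2.

```python
import math


def combine_r(rs, ns):
    """Average several correlations through Fisher's z.

    Each z is weighted by n - 3 (an assumption; the text says only
    'the method of average zs'). Returns the combined r and the
    summed weight, which is reused for the standard error.
    """
    weights = [n - 3 for n in ns]
    z_bar = sum(w * math.atanh(r) for w, r in zip(weights, rs)) / sum(weights)
    return math.tanh(z_bar), sum(weights)


def compare_r(r1, w1, r2, w2):
    """Critical ratio and two-sided p for the difference between two rs."""
    se = math.sqrt(1 / w1 + 1 / w2)          # standard error of z1 - z2
    cr = (math.atanh(r1) - math.atanh(r2)) / se
    p = math.erfc(abs(cr) / math.sqrt(2))    # two-sided normal probability
    return cr, p


# BB II coefficients for the college, patient and brain-injured groups.
bb2_r, bb2_w = combine_r([0.79, 0.81, 0.80], [50, 49, 31])

# Combined coefficients compared with the combined WAIS II value of .70;
# the WAIS weight is (50 - 3) + (56 - 3) + (30 - 3) = 127.
cr_bb2, p_bb2 = compare_r(0.80, bb2_w, 0.70, 127)
cr_bb3, p_bb3 = compare_r(0.81, bb2_w, 0.70, 127)
print(round(bb2_r, 2), round(cr_bb2, 2), round(cr_bb3, 2))
```

Under these assumptions the two critical ratios fall in the probability bands reported in the text for the BB II and BB III comparisons with WAIS II, which suggests, without proving, that the original analysis proceeded along similar lines.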
It is possible that the higher reliability of the BB procedures results
from the introduction of new elements into the task that can be more
reliably measured. To investigate this possibility, an estimate of the
correlation between the two types of scores was secured from a separate
study of the WAIS II and BB III performances of 98 college students,
half of whom received the WAIS II administration followed, after a
20-30 minute period, by the BB III administration, and the other half
of whom received the two administrations in reverse order. The
correlation coefficient between the two administrations was .765 for
the first half-group and .767 for the second half-group. Correction for
attenuation due to lack of perfect reliability of the measures raised
this correlation coefficient to .96, suggesting that the two tests are
imperfectly reliable measures of the same true component.
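
The correction for attenuation divides the observed correlation by the geometric mean of the two reliabilities. The sketch below shows the arithmetic. The report does not state which reliability values entered the correction, so the two values of .80 used here are purely hypothetical and were chosen only because they yield a corrected coefficient close to the reported .96.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Spearman's correction: observed r divided by sqrt(r_xx * r_yy)."""
    return r_xy / math.sqrt(r_xx * r_yy)


observed = (0.765 + 0.767) / 2   # mean of the two half-group correlations
# Hypothetical reliabilities; not taken from the report.
corrected = correct_for_attenuation(observed, 0.80, 0.80)
print(round(corrected, 2))
```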

Determinations of administration time. Mean time for the standard
(WAIS II) administration was found to be 3 min., 46 sec. in a group of
eight college students. Mean time for the WAIS III administration was
found to be 4 min., 4 sec. in this group of Ss. In a second group of
seven college students, who were comparable in respect to level of
performance with the first group, the mean time for the BB II
administration was found to be 4 min., 46 sec., exactly a minute longer
than for the standard administration. Mean time for the BB III
administration was found to be 5 min., 7 sec. for these Ss. Thus, as
compared with the standard administration, the BB III administration
involved an increase of 1 min., 21 sec. in administration time.

Equivalent BB III and WAIS II scores. To facilitate practical use of
the BB III administration, a table of equivalent BB III-WAIS II scores
was constructed by equating the two sets of initial scores (136 WAIS
scores and 130 BB scores) through employment of the equipercentile
method applied to estimated distributions of true scores, as described
by Flanagan (2). This table of equivalent BB III and WAIS II raw
scores, together with the corresponding WAIS scaled score equivalents,
is presented in Table 3. In this table, the WAIS II equivalent raw
scores within the range of 6 to 26 were derived empirically. Outside
this range they were derived by logical extrapolation. The SD of the
WAIS equivalent raw scores was found to be 4.51, which was not
significantly lower than the SD (5.65) of the original WAIS II raw
scores, indicating that the transformation did not result in an
important reduction in variability as would have been the case had it
been effected by means of a regression equation.

Repetition and reversal of digits. The total digit span test was
separated into its “forward” and “backward” components and the
test-retest reliabilities of each component under the WAIS II and BB
III administrations compared. Under the WAIS II administration the
test-retest reliability coefficient of digits forward was found to be
.51. Under the BB III administration the same statistic was .75. Under
the WAIS II administration the test-retest reliability coefficient for
reversal of digits was .64. Under the BB III administration it was .71.


Table 3

Transformation Table for Prediction of WAIS II Raw Scores and WAIS
Scaled Score Equivalents from BB III Raw Scores

                               WAIS II
  BB III       WAIS II      Scaled Score
 Raw Score    Raw Score      Equivalent
    28          17.0             19
    27          16.9             18
    26          16.8             17
    25          16.0             16
    24          15.6             16
    23          15.3             15
    22          14.8             15
    21          14.6             14
    20          14.2             14
    19          13.6             13
    18          12.8             12
    17          12.3             11
    16          11.7             11
    15          11.2             10
    14          10.7             10
    13          10.3              9
    12           9.6              8
    11           9.0              ?
    10           8.5              7
     9           8.0              6
     8           7.2              5
     7           6.8              4
     6           6.4              3
     5           6.0              2
     4           5.5              2
     3           5.0              1
     2           4.0              1
     1           3.0              0
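
For routine use, the conversion in Table 3 amounts to a simple lookup. The sketch below transcribes the table into a Python mapping. The names are hypothetical, the decimal points in the entries for BB III scores of 15 and 4 are restored on the assumption that they were lost in reproduction, and the scaled score for a BB III score of 11 is not legible in the source and is therefore left as None.

```python
# BB III raw score -> (WAIS II raw-score equivalent, WAIS scaled score)
TABLE_3 = {
    28: (17.0, 19), 27: (16.9, 18), 26: (16.8, 17), 25: (16.0, 16),
    24: (15.6, 16), 23: (15.3, 15), 22: (14.8, 15), 21: (14.6, 14),
    20: (14.2, 14), 19: (13.6, 13), 18: (12.8, 12), 17: (12.3, 11),
    16: (11.7, 11), 15: (11.2, 10), 14: (10.7, 10), 13: (10.3, 9),
    12: (9.6, 8), 11: (9.0, None), 10: (8.5, 7), 9: (8.0, 6),
    8: (7.2, 5), 7: (6.8, 4), 6: (6.4, 3), 5: (6.0, 2),
    4: (5.5, 2), 3: (5.0, 1), 2: (4.0, 1), 1: (3.0, 0),
}


def wais_equivalents(bb3_raw):
    """Return (WAIS II raw equivalent, WAIS scaled score) for a BB III raw score."""
    if bb3_raw not in TABLE_3:
        raise ValueError(f"BB III raw score {bb3_raw} is outside Table 3")
    return TABLE_3[bb3_raw]


print(wais_equivalents(20))
```

Scores outside the tabled range raise an error rather than being extrapolated, since the table itself already relies on extrapolation at its extremes.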

Observations on children. The test-retest 
reliabilities of the four administrations were 
examined in a group of 77 children within 
the age range of 6-12 years, 39 of whom re- 
ceived the WAIS test-retest procedure and 38 
of whom received the BB test-retest pro- 
cedure. The obtained reliability coefficients 
were as follows: WAIS II: .76; WAIS III: 
.77; BB II: .88; BB III: .89. 

Observations on mental defectives. Test-re- 
test reliability of performance under the four 
conditions of administration was also investi- 
gated for 77 mental defectives, 39 of whom 
received the WAIS test-retest procedure and 
38 of whom received the BB test-retest pro- 
cedure. The obtained reliability coefficients 
were as follows: WAIS II: .89; WAIS III:
.88; BB II: .94; BB III: .95.


Discussion 


The findings on the main groups of adult Ss indicate that slight
modifications in the administration and scoring of the digit span test
result in a significant increase in the test-retest reliability of the
task. Findings on smaller groups of children and mental defectives are
consistent with this conclusion. With the standard administration, the
test-retest reliability in a heterogeneous group of adult Ss was found
to be .70. With the revised administration, the reliability coefficient
in a comparable group of Ss was found to be .80 to .81. Since the
revised administrative procedure generally involved about an additional
minute in administration time, its employment in the clinical
examination would seem to be indicated.

The table of equivalent scores is designed to facilitate routine
clinical use of the revised administration. While in all probability
either the BB II or the BB III administrative procedure would serve
equally well for the prediction of WAIS II raw score equivalents,
certain minor considerations made it seem wise to select the latter for
this purpose. Its test-retest reliability was found to be slightly
higher than that of the BB II procedure. Moreover, in the main
comparison of reliabilities, the BB III procedure was more reliable
than the WAIS II procedure to a degree clearly acceptable (p < .05)
while the difference in test-retest reliability between the BB II and
WAIS II procedures did not reach the .05 level in respect to
significance. Since the difference in administration time between the
two BB procedures was found to be minimal (mean, 21 sec.), no practical
disadvantage is associated with employment of the BB III procedure.

The transformation table is designed expressly for the interpretation
of the performances of adult Ss who are not grossly defective. There is
no justification at this time for its employment with children or
mental defectives.

The analysis of digit span performance into the components of
repetition and reversal of digits indicated that the superiority of the
BB administration was based on the fact that it effected an increase in
the reliability of the task of repetition of digits while the
relatively low reliability of the task of reversing digits remained
essentially unchanged. It must be concluded that the superiority of the
BB administration with respect to test-retest reliability applies only
to the digit span test as a whole and not to comparisons of digits
forward and digits backward which may be made. Since this type of
comparison is often made by the clinical examiner, exploration of
further procedural modifications designed to augment the reliability of
the reversal of digits task ought to be attempted.



Summary 
1. This study explored the effect of certain modifications in
administration and scoring on the test-retest reliability of the digit
span. The modifications consisted of: (a) having S repeat or reverse
both sets of digits of a given series length even when he had correctly
repeated or reversed the first set of the pair; (b) terminating the
repetition or reversal of digits after three successive failures
rather than two; (c) giving credit in scoring 
for each set of digits correctly repeated or 
reversed rather than by the usual “highest 
score” method. 

2. It was found that these modifications re- 
sulted in a significant increase in the test-re- 
test reliability of the task. This increase in 
reliability was effected primarily through aug- 
mentation of the reliability of performance on 
the “digits forward” component of the task. 
Under both the standard and revised adminis- 
trations, the “digits backward” component 
showed unsatisfactory reliability. 

3. The time required for the revised ad- 
ministrations was found to be approximately 
g0 sec. longer than that required for the 
standard administration. 

4. In order to facilitate clinical use of the revised administration a
transformation table was constructed whereby both raw scores derived
from the standard administration and WAIS scale score equivalents can
be predicted from raw scores derived from the revised administration.


Received July 26, 1956. 
The Relationship of the WISC and Stanford-Binet
to School Achievement¹


Ernest S. Barratt and Doris L. Baumgarten 
University of Delaware 


The Stanford-Binet and the WISC are both 
widely used in clinics and schools to estimate 
intellectual capacity. This study is designed 
to relate scores on the WISC and the 1937 
Revision of the Stanford-Binet (Form L) to 
scores on the California Achievement Tests 
(reading and arithmetic subtests) for 30 
achievers and 30 nonachievers in grades 4 to 
6. Achievers and nonachievers were defined by 
teachers’ ratings of students’ school perform- 
ance. 

The results of achievers on the WISC were: Full Scale (FS) IQ,
M = 117.47, σ = 9.81; Verbal IQ (V), M = 121.17, σ = 10.30;
Performance IQ (P), M = 110.10, σ = 11.46. Results of nonachievers
were: FS, M = 86.90, σ = 12.46; V, M = 88.23, σ = 13.07; P, M = 91.50,
σ = 11.69. On the Binet, the achievers’ mean was 126.47, σ = 11.99; the
nonachievers’ mean was 88.27, σ = 13.28.

The rs between reading achievement and IQ scores of achievers were:
WISC FS, .56; WISC V, .61; WISC P, .29; Binet, .62. For nonachievers
these rs were: WISC FS, .63; WISC V, .51; WISC P, .30; Binet, .46.

The rs between arithmetic achievement and IQ scores for achievers were:
WISC FS, .14; WISC V, .09; WISC P, .14; Binet, .12. For nonachievers
these rs were: WISC FS, .79; WISC V, .73; WISC P, .33; Binet, .52.


1 An extended report of this study may be obtained without charge from
Ernest S. Barratt, Department of Psychology, University of Delaware,
Newark, Delaware, or for a fee from the American Documentation
Institute. Order Document No. 5106, remitting $2.00 for microfilm or
$3.75 for photocopies.


The rs between WISC subtests and reading achievement for achievers
were: Inform., .58; Comp., .22; Arith., .50; Simil., .61; Vocab., .46;
Digit S., .18; P. Comp., -.15; P. Arr., .32; Bl. Dsg., .52; O. Assemb.,
.36; Coding, .15; Mazes, -.02. For nonachievers these rs were: Inform.,
.16; Comp., .09; Arith., .41; Simil., .23; Vocab., .25; Digit S., .09;
P. Comp., .20; P. Arr., .22; Bl. Dsg., .00; O. Assemb., .36; Coding,
.46; Mazes, .00.

The rs between WISC subtests and arithmetic achievement for achievers
were: Inform., .11; Comp., -.02; Arith., .18; Simil., .11; Vocab.,
-.54; Digit S., -.04; P. Comp., .04; P. Arr., .11; Bl. Dsg., .15;
O. Assemb., .18; Coding, .12; Mazes, .00. For nonachievers these rs
were: Inform., .38; Comp., .42; Arith., .43; Simil., .43; Vocab., .51;
Digit S.,
— .07; P. Comp., .35; P. Arr., .39; Bl. Dsg., 
= O. Assemb., .25; Coding, .34; Mazes, 

For achievers, one intelligence test is not a better predictor than the
others of either reading or arithmetic achievement. The same conclusion
is true for the nonachievers except for the WISC P score, which is not
as highly related to arithmetic achievement as are the WISC V and FS
scores.

Several observations lead to the general 
conclusion that obtaining a true measure of 
arithmetic ability in nonachievers is difficult 
because of their difficulties in using verbal 
symbolism. One distinguishing characteristic 
of the achievers in this study is their rela- 
tively high verbal ability. 

Brief Report. 
Received December 4, 1956. 
Self-Acceptance and Psychopathology


Marvin Zuckerman* and Irwin Monashkin
Larue D. Carter Memorial Hospital, Indianapolis, Ind.


Rogers (7) has suggested that self-acceptance is a good criterion for
progress in psychotherapy. Supporting this idea is the finding that the
self concept becomes more like the ideal concept during the course of
psychotherapy. Implicit in it is the assumption that self-acceptance
varies directly with adjustment. In a later work (8), Rogers does
comment that high self-ideal correlations may indicate either genuine
adjustment or defensive behavior. However, he still regards the changes
in self-ideal correlations during psychotherapy as indications of
increased adjustment in the group.

Berger (1), and Block and Thomas (5) found that self-acceptance in
college students was negatively correlated with certain clinical scales
of the MMPI and positively correlated with the K scale. If the clinical
scales are considered criteria of adjustment, the correlations between
self-acceptance and these scales would suggest that self-acceptance is
related to adjustment. However, the positive correlation between
self-acceptance and the K scale, which has been interpreted as a
measure of defensiveness, suggests another hypothesis. The
self-satisfied subjects may be maladjusted but defensive or lacking
insight into their condition. Block and Thomas (5) seem to favor this
latter hypothesis. They view the extremely self-accepting subjects as
“overcontrolled” and denying. If this view is correct it may be
possible to view the correlations between clinical MMPI scales and
self-acceptance as indications of the common influences of defensive
processes rather than the common influence of underlying adjustment.


* Institute for Psychiatric Research, Indiana University Medical
Center, Indianapolis, Ind.


The purposes of this study are:



1. To see if the relationships found be- 
tween self-acceptance and particular MMPI 
scales in college students can be replicated in 
a sample of psychiatric patients. 

2. To see if there is any relationship be- 
tween self-acceptance and adjustment using 
an external criterion: a rating of adjustment 
based on the case history. 


Subjects 


The subjects were 43 psychiatric patients, including 18 men and 25
women. They were all new admissions who had taken the MMPI and
Shipley-Hartford Vocabulary scales. Those with below average Shipley
scores were not used. Eighteen of the patients were diagnosed as
psychoneurotic or personality trait disorder, 22 were diagnosed as
schizophrenic, and 3 were diagnosed as psychotic depression. The mean
age was 34; mean education was 12 years of school; mean vocabulary
score was 29.3 (equivalent to an IQ of about 111).

Procedure 


The scale used to measure conceptions of 
self and ideal was adapted from a scale de- 
veloped by Buss at this hospital. It consists 
of 16 subscales covering clinically relevant 
dimensions. A list of eight scaled adjectives 
describes the points on each subscale. The S 
rates by choosing one of the eight words 
which he feels is most descriptive of himself, 
and the scale value of that word is used as his 
score for the self concept on that subscale. 
The Ss were first asked to rate themselves as 
they are “in general.” On a second copy of 
the scales they were asked to rate themselves 
as they “would like to be.” For each S a discrepancy
score was computed by subtracting
the scale values of the ratings on the ideal
concept from the corresponding ratings on the
self concept. The signs of these differences 
were ignored and the summed differences 
were the discrepancy scores. The larger the 
difference between ideal and self, the more 
dissimilar are the conceptions, and the less 
accepting the individual is of himself. Self- 
acceptance, as defined, varies inversely with 
the size of the discrepancy score between self 
and ideal. 
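
The discrepancy score described above is a sum of absolute self-minus-ideal differences over the 16 subscales. A minimal sketch follows; the function name and the integer scale values in the example are hypothetical, since the actual scale values of the adjectives are not given in the report.

```python
def discrepancy_score(self_ratings, ideal_ratings):
    """Sum of absolute self-minus-ideal differences across subscales.

    Larger scores mean self and ideal are more dissimilar, i.e. lower
    self-acceptance as defined in the text.
    """
    if len(self_ratings) != len(ideal_ratings):
        raise ValueError("self and ideal ratings must cover the same subscales")
    return sum(abs(s - i) for s, i in zip(self_ratings, ideal_ratings))


# Made-up ratings on 16 subscales for one S.
self_concept = [3, 5, 2, 6, 4, 4, 7, 1, 5, 3, 2, 6, 4, 5, 3, 2]
ideal_concept = [2, 5, 4, 6, 3, 5, 4, 1, 6, 3, 2, 4, 4, 6, 1, 2]
print(discrepancy_score(self_concept, ideal_concept))
```

Because the signs of the differences are ignored, a subscale on which S rates himself above his ideal contributes exactly as much as one on which he rates himself below it.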

Case-history rating. Adjustment was de- 
fined by ratings of the patients based on 
their case histories. The writers independently 
rated the final case summaries in the patients’ 
charts on a 7-point scale of adjustment. Fac- 
tors considered in the global rating were 
severity of symptoms, bizarreness of idea- 
tion, degree of incapacitation, acuteness, or 
chronicity of the disorder, adequacy of pre- 
morbid adjustment, and course of the dis- 
order. In general, psychotics tended to fall at 
the upper end of the scale and neurotics at 
the lower end, with some overlap in the mid- 
dle between severe neurosis and mild psy- 
chosis. 

Interrater reliability for the 43 cases was 
.77. Using the same rating on 60 cases in an- 
other study a reliability of .80 was obtained. 
The ratings of the two judges were averaged 
for each patient to get the case-history rating. 
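
The two steps described here, estimating agreement between the judges and averaging their ratings, can be sketched as follows. The report does not say which coefficient was used for interrater reliability, so the product-moment correlation below is an assumption, and the ratings shown are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length rating lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def case_history_ratings(judge_a, judge_b):
    """Average the two judges' ratings for each patient."""
    return [(a + b) / 2 for a, b in zip(judge_a, judge_b)]


# Invented 7-point adjustment ratings for six patients.
judge_a = [2, 5, 6, 3, 7, 4]
judge_b = [3, 5, 7, 2, 6, 4]
reliability = pearson_r(judge_a, judge_b)
ratings = case_history_ratings(judge_a, judge_b)
print(ratings)
```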


Table 1

Correlations Between Self-Acceptance and MMPI Scales

                                     Block and    Zuckerman and
                   Berger             Thomas        Monashkin
Sample      Students   Students      Students       Patients
Sex           Men       Women         M & W          M & W
N             109        76            56             43
F              -          -           -.54*          -.54*
K              ?          ?            .33*           .38*
Hs           -.08       -.25*         -.59*          -.31*
D            -.45*      -.54*         -.63*          -.42*
Hy             ?          ?             ?              ?
Pd           -.03       -.26*         -.62*            ?
Pa           -.30*      -.29*         -.11           -.39*
Pt           -.52*      -.55*         -.69*          -.51*
Sc           -.40*      -.49*         -.63*          -.53*
Ma           -.11       -.12          -.22           +.16
Si           -.63*      -.70*           -            -.52*


* Significant at or below the .05 level.
Results


The correlations between self-acceptance 
and MMPI scales obtained in our patient 
population are compared with the correlations 
found by Berger (1), and Block and Thomas 
(5) in Table 1. In general the results are re- 
markably similar. They are similar in spite of 
the differences in instruments used and in 
types of population. Berger used an inventory 
to measure self-acceptance, Block and Thomas 
used Q sorts of self and ideal, and we used 
adjective rating scales for self and ideal. 
Berger, and Block and Thomas used college 
students while we used psychiatric patients, 
who presumably cover a wider range of the 
adjustment continuum. The positive correla- 
tion between self-acceptance and the K scale 
is obtained in all three studies. The negative 
correlations between self-acceptance and the 
D, Pt, and Sc scales are also found in all 
three studies. The negative correlation be- 
tween self-acceptance and the Hs scale is ob- 
tained in the three studies with the exception 
of the college men in Berger’s study. The 
negative correlations between self-acceptance
and the F and Si scales were found in the 
two studies where they were measured. The 
negative correlation between self-acceptance 
and the Pa scale was found only in Berger’s 
and this study. The positive correlation with 
Hy was found only in Berger’s male Ss, and 
the negative correlation with Pd was found in 
Berger’s female Ss, and Block and Thomas’ 
male and female Ss.

The present results replicate six out of the 
six relationships found between self-acceptance
and MMPI scales in both sexes in the
Berger study, and six out of seven of the re- 
lationships found in the Block and Thomas 
study. 

Another way to analyze the MMPI data, which is closer to usual clinical
practice, is to analyze the peaks in the profile disregarding the
height of the profile. Table 2 shows the percentages of either the low
self-accepting or high self-accepting groups (defined as those above
and below the median) having any one of the MMPI scales as the first or
second highest scale in the profile. The low self-accepting group peaks
significantly more often on the D and Pt scales. The high
self-accepting group peaks significantly more often on the Ma scale and
shows a tendency, short of significance, to peak more often on the Pd
scale.


Table 2

Percentages of Low and High Self-Accepters
Peaking† on MMPI Scales

Scale     Low      High      CR
Hs        14.3     13.0       ?
D         52.4     22.7     2.01*
Hy         9.5     18.2      .59
Pd        28.6     54.5     1.73
Pa         9.5     13.0      .36
Pt        42.9      9.1     2.54*
Sc        52.4     36.4     1.06
Ma         0.0     31.8     2.84*


* Significant at or below the .05 level.
† “Peaking” is defined by a subject having the particular scale as the
first or second highest scale in his profile.
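
A critical ratio (CR) for the difference between two percentages, of the kind reported in Table 2, can be computed as sketched below. The report does not describe the formula used, so the pooled standard error is an assumption, and the counts (11 of 21 low self-accepters and 5 of 22 high self-accepters peaking on D; 9 of 21 and 2 of 22 on Pt) are inferred from the tabled percentages rather than stated in the text.

```python
import math


def critical_ratio(x1, n1, x2, n2):
    """Difference between two proportions divided by its pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Counts inferred from the percentages in Table 2 (assumed, not reported).
cr_d = critical_ratio(11, 21, 5, 22)   # D scale: 52.4% vs. 22.7%
cr_pt = critical_ratio(9, 21, 2, 22)   # Pt scale: 42.9% vs. 9.1%
print(round(cr_d, 2), round(cr_pt, 2))
```

With these inferred counts the sketch gives values close to the tabled CRs for D and Pt, which is consistent with, though it does not establish, a pooled two-proportion test.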

The final question in our study is the re- 
lationship between self-acceptance and ad- 
justment as defined by an external criterion, 
the case history rating. The correlation be- 
tween self-acceptance and the case history 
rating was -.06, clearly not significantly
different from zero. The plotted data reveal
neither a rectilinear nor a curvilinear relationship.

Discussion 


Why should a tendency to be satisfied with oneself be related to low
scores on MMPI scales? Rosen (9) has found that the relationship
between self-indorsement of MMPI items and personal desirability of
these items is .87 for both sexes. The scales which are most consistent
in their negative correlations with self-acceptance (D, Pt, Sc, Si) are
precisely those scales most influenced by desirability of the items in
Rosen’s study. The K scale, which is positively correlated with
self-acceptance, is also correlated with the tendency to answer MMPI
items in the direction of desirability. The person who is
self-satisfied is likely to answer MMPI items in a way which he
considers personally and socially desirable. Thus, both self-acceptance
and MMPI scales are probably being influenced more by the common trait
of defensiveness than by actual adjustment. In the present study 6 of
the 9 paranoid schizophrenics were in the high self-accepting third of
the distribution. Despite this fact, the Pa scale correlated negatively
with self-acceptance. Apparently if a patient is highly self-accepting,
he can evade detection on a scale designed specifically to reveal his
traits. Other patients who are not paranoid score high on the Pa scale
merely because of a readiness to admit negative traits in themselves.

Our self-dissatisfied patients tend to score 
higher on the D scale and peak more often 
on this scale than the self-satisfied patients. 
In the MMPI manual, a high D score is said 
to indicate “poor morale of the emotional 
type with a feeling of uselessness and inabil- 
ity to assume a normal optimism with regard 
to the future... lack of self-confidence, 
tendency to worry, narrowness of interests 
and introversion” (6, p. 19). The correlation 
between these traits and lack of self-accept- 
ance is not surprising. Bills found that people 
with low self-acceptance show depressive signs 
on the Rorschach (3), and tend to internalize 
blame for their problems (4). Our low self- 
accepters tend to be higher and peak more 
frequently on the Pt scale. The manual describes
the Pt scale as indicating phobias,
compulsive behavior, mild depression and 
lack of confidence. Interpreting the other cor- 
relations found in Table 1, it seems that low 
self-accepting patients tend to describe them- 
selves as concerned about bodily functions 
(Hs), suspicious and oversensitive (Pa), possessing
bizarre or unusual thoughts (Sc) and
introverted (Si). 

The high self-accepting patients peak more 
frequently than low self-accepters on the Ma 
scale. Some characteristics of people scoring 
high on this scale are “active . . . enthusiastic
. . . disregard of social conventions.”
Actually the most frequent peak in the high 
self-accepting group is the Pd scale. This 
scale is related to “absence of deep emotional 
response, inability to profit from experience 
and disregard of social mores.” 

The contrast between the types of personalities descriptive of low and
high self-accepters is marked. The high self-accepter is defensive,
tends to act out his problems and externalizes blame; while the low
self-accepter internalizes, socially withdraws, and suffers from
depression and doubt. These findings are consistent with Bills’s
results (2) on the differences in the Rorschachs given by low and high
self-accepters.

Although self-acceptance probably reflects 
differences in modes of handling personal 
maladjustment, self-acceptance is not related
to actual adjustment in patients as seen by 
others. Self-acceptance might conceivably 
bear more of a relationship to actual adjust- 
ment in outpatients who come in voluntarily 
to receive psychotherapy. Perhaps in this type 
of group defensive patterns are less common. 
However, it is apparent that self-acceptance 
should not be used as the sole criterion for 
improvement in psychotherapy. In fact, any 
questionnaire type of test which does not con- 
trol for item desirability should not be used 
as a sole criterion since it is probably sub- 
ject to the same defensive effects that affect 
self-ratings. 


Summary 


This study was undertaken to see if rela- 
tionships between self-acceptance and MMPI 
scales found in college students could be 
replicated in patients; and to see if these re- 
lationships were due to a real relationship 
between self-acceptance and adjustment. 

The subjects were 43 psychiatric patients. 
They rated their self and ideal concepts on 
adjective scales. The discrepancy between rat- 
ings on these two concepts was used as a 
measure of self-acceptance. All patients had 
taken the MMPI. Ratings of adjustment were 
made on the basis of the final case summary 
on the patients. 

Significant negative relationships were 
found between self-acceptance and the F, Hs, 
D, Pa, Pt, Sc, and Si scales of the MMPI. 
A significant positive relationship was found 
between self-acceptance and the K scale.
Most of these relationships replicated the results
found in college students. Using a profile
analysis approach, the low self-accepters
were found to have D and Pt as their first or 
second highest scores significantly more fre- 
quently than high self-accepters. High self- 
accepters were found to peak significantly 
more frequently on Ma. There was no rela- 
tionship between self-acceptance and adjust- 
ment as measured by the case-history rating. 

The results were interpreted in the light of 
the influence of personal and social desirability
on MMPI items, and different modes
of handling problems in low and high
self-accepters.


Disorientation as a Prognostic Criterion¹


A. Eskey, Gladys Miller Friedman, and Ira Friedman 


Cleveland Receiving Hospital and State Institute of Psychiatry 


Anecdotal evidence lends support to the
[view] that patients who are acutely disturbed,
showing marked signs of confusion and
disorientation, tend to have a more favorable
[prognosis] than patients whose symptoms are
more insidious and less dramatic. Chase and
[Silverman] (2) point out that, among other
[factors], acute onset and confusion are
favorable prognostic signs. Albee points out that
“[clinicians] with long and intimate experience
with schizophrenics are all too familiar
with the patient whose relatively mild overt
[symptomatology] defies every therapeutic
effort, while another patient whose symptoms
[seemed] to be indicative of total personality
disorganization achieves a sudden and
unexpected remission” (1, p. 208). Reviews of
prognostic signs in schizophrenia are also
discussed by Mayer-Gross and Moore (6),
[name illegible] (4), Strecker (7), and Malamud and
Render (5). There appears to be little doubt
that marked changes frequently occur with
[acutely] disturbed patients, but there is no
systematic investigation which attempts to
assess whether this change occurs more
frequently than with patients who display greater
ego-intactness and less extreme symptomatology.
The purpose of the present study is to
evaluate the prognostic significance of extreme
symptomatology. [Disorientation] was selected
as the independent variable, because it is
traditionally employed in psychiatric description.
It also presents an objective measure of lack of
ego-intactness, and it correlates highly with
confusion (if not being identical with it).
Because improvement is always a difficult
variable to measure, we reasoned that it would
be valid to select patients discharged as improved,
and the length of hospital stay would
tell us how rapidly they improved, i.e., prognosis
with reference to duration of “illness”
rather than to degree of improvement. The
study, specifically, is a comparison between
an oriented and disoriented group in terms
of length of hospital stay.

1 The authors wish to acknowledge the [illegible]
in originally suggesting [remainder illegible].


Method


One hundred psychotic patients exhibiting
varying degrees and types of disorientation
upon admission to Cleveland Receiving Hospital
and State Institute of Psychiatry were
selected at random from reports of mental
status examinations. A control group of 100
patients who were well oriented upon admission
was in like manner chosen. Only first
admissions coming to the hospital during the
years 1952 through 1955 were used. The age
range was 16 to 59 inclusive, with all patients
being eliminated from the study who
remained in the hospital less than 14 days.
In order to control factors which were felt to
influence length of hospitalization, the following
types of cases were rejected: patients
with a diagnosis of acute or chronic brain
syndrome, alcoholics, mental defectives, chronic
patients who were transferred to long-term
treatment centers, those who were not improved
upon discharge, and those who were
released against medical advice. In brief, our
two groups were composed of selected psychotic
patients who stayed in the hospital
more than two weeks and who were discharged
from the hospital as improved or recovered.

The groups were matched on the basis of
age, sex, and psychiatric diagnosis. In matching,
age distributions were separated into
five-year intervals. If, for example, there
were four females and one male in the disoriented
group age 30 through 34, a matching
sample was obtained for the corresponding
age group among the oriented patients.
Other factors such as marital status, race,
and type of treatment were approximately
equal between groups. Accuracy of matching
is apparent from Table 1.
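The matching rule described above (five-year age intervals, with equal counts per sex in each interval) can be sketched as follows. This is only an illustration: the function names and the individual ages are hypothetical, the intervals are assumed to start on multiples of five (30 through 34, as in the authors' example), and the article does not say how matching was actually carried out.

```python
from collections import Counter

def age_interval(age):
    """Return the five-year interval (inclusive bounds) containing an age.

    Assumption: intervals start on multiples of five, e.g. 30 through 34.
    """
    start = (age // 5) * 5
    return (start, start + 4)

def matching_quota(patients):
    """Count patients per (age interval, sex) cell.

    `patients` is an iterable of (age, sex) pairs. The resulting counts
    are the numbers that a matched comparison group would have to supply.
    """
    return Counter((age_interval(age), sex) for age, sex in patients)

# Hypothetical ages reproducing the example in the text:
# four females and one male aged 30 through 34.
disoriented = [(30, "F"), (31, "F"), (33, "F"), (34, "F"), (32, "M")]
quota = matching_quota(disoriented)
print(quota)
```

A matched oriented sample would then be drawn until its own counts equal `quota` in every cell.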

Disorientation in the spheres of time, place, 
and person was selected as an indication of 
confusion. In some instances the patients 
were disoriented in only one area. In a few 


Table 1 


Factors Showing Accuracy of Matching 
Between Groups 


Criterion                        Oriented   Disoriented

Mean Age                           35.8        35.4
Sex
  Male                             35          35
  Female                           65          65
Race
  White                            74          66
  Negro                            26          34
Marital Status
  Married                          48          53
  Single                           34          29
  Separated                         3          10
  Divorced                         10           6
  Widow                             5           1
  Widower                           0           1
Clinical Groups
  Schizophrenia
    Paranoid                       44          44
    Catatonic                      20          20
    Simple                          4           4
    Hebephrenic                     1           1
    Chronic Undifferentiated        7           7
    Acute Undifferentiated          2           2
    Schizo-Affective                2           2
  Manic-Depressive
    Manic                           2           2
    Depressed                       5           5
  Involutional Psychosis           13          13
Treatment
  ECT                              63          81
  ECT and Insulin                   9           4
  Insulin                           1           1
  Other                            27          14
cases there was some question as to whether
the patient was actually disoriented. In this
latter instance three psychologists examined
the records, and only those cases where there
was complete agreement were included.
Accordingly, patients who knew the date within
four or five days, for example, were not
included in the disoriented group. The difference
in length of hospitalization for the oriented
and disoriented groups was subjected to
statistical analysis.


Results 


When the total number of patients were ar- 
ranged in frequency tables based on 30-day 
intervals of hospitalization, it became appar- 
ent that neither group was normally dis- 
tributed. Both distributions were skewed in 
that they grouped toward the low end of the 
scale, i.e., the great majority of discharges 
occurred within a 120-day period. At the
opposite end of the distributions (greater
number of days of hospitalization), it was found
that there were more disoriented than
oriented patients. Because of these few extreme
cases, the median was used as a basis for
comparison between the groups, as it is less
affected than the mean by extreme scores. The
results showed that the oriented group had a
median of 73.9 days of hospitalization with
an SE of 6.87, while the disoriented group
had a median of 85.4 with an SE of 9.55.
Figure 1 shows the distributions.

[Fig. 1. Length of hospitalization for oriented and
disoriented groups. Vertical axis: number of patients.
Horizontal axis: time spent in hospital (in days), in
30-day intervals. Separate curves for patients oriented
on admission and disoriented on admission.]

A comparison was also made between the
two groups containing the largest number of
cases with respect to specific kinds of
disorientation. These groups were patients
disoriented for time alone and those who were
disoriented for both time and place. In other
words, we wanted to compare a group of patients
with greater confusion as reflected by
disorientation for both time and place with
a group disoriented for time alone. Those
disoriented for time had a median of 83.2
days of hospitalization with an SE of 11.17
and the time-place disoriented group had a
median of 89.0 days with an SE of 17.66.
There were too few patients with disorientation
for person to include this factor in the
comparisons which were made.

When the reliability of the difference between
the medians of the oriented and disoriented
groups was determined, a t value of
1.02 (p = .30) was obtained. Thus when the
most appropriate statistical measures were
employed (comparison based on medians),
the results indicate no significant difference
between the groups.

When the comparison was made of the
group disoriented for time alone with the
group disoriented for both time and place, the
resulting t value of .27 (p = .80) shows no
significant difference between the groups.
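The two t values can be roughly checked from the medians and standard errors given in the Results. The sketch below assumes independent groups and takes the standard error of a difference as the square root of the summed squared standard errors. The authors do not state their formula, so the function is an illustration rather than their procedure.

```python
import math

def t_from_medians(median_a, se_a, median_b, se_b):
    """Ratio of the difference between two medians to its standard error.

    Assumption: the groups are independent, so the standard error of the
    difference is sqrt(se_a**2 + se_b**2).
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(median_a - median_b) / se_diff

# Oriented (73.9, SE 6.87) versus disoriented (85.4, SE 9.55).
t_oriented_vs_disoriented = t_from_medians(73.9, 6.87, 85.4, 9.55)

# Time-only (83.2, SE 11.17) versus time-and-place (89.0, SE 17.66).
t_time_vs_time_place = t_from_medians(83.2, 11.17, 89.0, 17.66)

print(round(t_oriented_vs_disoriented, 2), round(t_time_vs_time_place, 2))
```

Under these assumptions the ratios come to about .98 and .28. These are close to the reported values, and neither approaches significance.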


Discussion 


The results indicate that patients who are
disoriented do not improve more rapidly than
patients with a greater degree of orientation,
a finding contrary to current thinking. It may
well be that when an acutely disturbed patient
recovers rapidly, the change is quite
dramatic. On the other hand, the change in
patients who do not display such intense
behavioral, overt symptomatology is much less
striking. According to Gestalt principles of
remembering (3) we are more prone to be
impressed by cases where the end result, as
[several lines are illegible in the source; the
surviving fragments read “pa-”, “d support to the”,
and “subject to unwarrante”]. It is also possible
that the variable of [dis]orientation is not
sufficiently representative of “acute onset . . .
confusion . . . and atypical symptomatology” (2)
to yield consistent results in the anticipated
direction. It is suggested that further research
may prove fruitful, taking additional variables
into account.


Summary 


The study is an investigation of the prog- 
nostic significance of disorientation in terms 
of the rapidity of improvement. One hundred 
disoriented patients were matched with 100 
oriented patients on the basis of age, sex, and 
psychiatric diagnosis. Precautions were taken 
to eliminate those patients from the study 
where certain nonrelevant factors might have 
operated to influence the length of their resi- 
dence in the hospital. Results indicate that 
there is no significant difference between 
groups with reference to length of time they 
remained in the hospital. No significant dif- 
ference was shown between patients disori- 
ented for time alone and those disoriented 
for both time and place. The results were ex- 
plained in terms of Gestalt principles of simi- 
larity and contrast, i.e. that we are more 
prone to be impressed by dramatic improve- 
ments where the patient is strikingly differ- 
ent from his initial disturbed state. While 
this study suggests our anecdotal reporting
may result in unwarranted overgeneralization,
the possibility that disorientation is
not sufficiently representative of “acute onset
. . . confusion and . . . atypical symptoms”
was pointed out. Further research is
recommended.


On the Relation Between A-Scale Scores and
Digit Symbol Performance 


Leonard D. Goodstein and I. E. Farber 


State University of Iowa 


In a number of recent studies, Matarazzo 
and his colleagues (3, 4, 5, 6) have been con- 
cerned with the possibility that the function 
relating scores on the Taylor A scale and per- 
formance in various other tasks involves a 
reversal of sign in the slope. Specifically, they 
have assumed that the maximum appears in 
the middle ranges of anxiety, with lower 
scores at the extremes. As Farber and Spence 
(1) have noted, this sort of relation might
occasionally be expected on theoretical grounds.
But, despite notably persistent attempts to
verify this hypothesis, the proponents of this
view have found scant evidence for it in their
own studies. Moreover, contrary to their
supposition (3, 6), no evidence of such a relation
has been found in conditioning experiments.
Spence and Taylor (7) have stated that Ss 
in the middle ranges of anxiety sometimes 
perform more like nonanxious than like anx- 
ious Ss, indicating that the relation between 
A-scale scores and conditioning performance 
may be curvilinear, i.e., nonlinear. But this is 
quite different from the notion that the rela- 
tion is nonmonotonic, with a reversal of sign 
in slope. There has apparently been some 
confusion concerning the nature of curvi- 
linear and nonmonotonic functions. 

The question remains, of course, whether 
in any specific situation there is any relation 
between anxiety and performance, and, if so, 
whether it is linear, curvilinear but mono- 
tonic, or curvilinear and nonmonotonic with 
a maximum at an intermediate level of anx- 
iety. Thus, Matarazzo and Phillips (3) have 
reported findings for a Digit Symbol task 
that might be interpreted in favor of a curvilinear
relation, in that t tests showed that
Ss at a low level of anxiety performed more 


poorly (p < .05) than those at two inter- 
mediate levels. There was no evidence that 
the relation was not monotonic, since the 
most anxious Ss did not differ significantly 
from those at the intermediate levels. However,
as these writers conjectured, it is possible
that this negative result was due to
their inability to use a more extreme group at
the high end of the scale because of the small
N involved.1

The present study presents the results of
an attempt to check the foregoing results
and hypotheses. Following the procedure of
Matarazzo and Phillips, a 175-item form of
the Wechsler-Bellevue Digit Symbol test and
the Taylor A scale were administered to 409
college underclassmen, 205 men and 204
women.

Results 


The Digit Symbol performances of the men 
and women at six levels of anxiety are presented
in Table 1. The class intervals defining
the levels were the same as those used by
Matarazzo and Phillips, except that a more
extreme group was added at the high end, in
accordance with their suggestion. The results
obtained in the earlier investigation are also
presented, for purposes of comparison.


1 The distribution of A-scale scores obtained in the
study by Matarazzo and Phillips differed markedly
from that of the Iowa standardization population
[citation illegible]. On the basis of the latter scores,
over 20% of their group might have been expected to
have scores of 21 or higher. A χ² test of the goodness
of fit of their frequencies to the expected frequencies
at the five levels of anxiety yields a value of 20.9,
which, for 2 df, is significant at p < .001. The medical
students who comprised the sample in the former study
were apparently less anxious than introductory
psychology students at Iowa.
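The probability attached to the goodness-of-fit value in the footnote can be checked without statistical tables, because the upper tail of the chi-square distribution has a closed form when df is even. The sketch below evaluates 20.9 at the 2 df given in the footnote. As an assumption of this illustration only, it also evaluates 4 df, which a five-category fit with no estimated parameters would ordinarily carry. The function name is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate for even df.

    Uses the closed form exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)**k / k!,
    which holds only when df is a positive even integer.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k   # builds (x/2)**k / k! incrementally
        total += term
    return math.exp(-half) * total

p_df2 = chi2_sf_even_df(20.9, 2)   # df as given in the footnote
p_df4 = chi2_sf_even_df(20.9, 4)   # assumption: five categories, none estimated
print(p_df2, p_df4)
```

On either reading the probability falls below .001, in line with the footnote.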
Table 1

Digit Symbol Scores as a Function of Anxiety Level

                           Present study               Matarazzo &
                       Men              Women           Phillips
       A-Scale
Level  interval    N     M     SD    N     M     SD     N     M

  1      0-5      15  106.0  17.5   14  110.9  14.5    18
  2      6-10     52  109.8  18.3   36  116.4  19.9         100.7
  3     11-15     61  106.5  14.9   57  117.8  20.6    34   109.8
  4     16-20     43  105.4  18.2   36  118.5  17.8    15   108.8
  5     21-25     19  101.5  15.7   26  113.3  18.5
  6     26-40     15  112.1  14.4   25  107.6  16.3    11   106.4
Total            205  107.0  18.3  204  115.4  19.1   119   108.1


Since the women’s mean performance was
significantly better than that of the men (t
= 4.[digits illegible], p < .001), and that for the men more
closely resembled the results obtained by
Matarazzo and Phillips, separate analyses of
variance were applied to the data for the two
sexes. In neither instance did the differences
among levels approach significance. For the
men, the value of F was less than unity, while
for the women the F value was 1.46, which
for 5 and 199 df is not significant at an
acceptable level (.20 < p < .30).
There was thus no reason to suppose that
any relation, curvilinear, nonmonotonic, or
of any other kind obtained between anxiety and
performance in the Digit Symbol task.
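The size of the sex difference can be approximated from the Total row of Table 1. The sketch below is an illustration under stated assumptions. It reads that row as M = 107.0, SD = 18.3, N = 205 for the men and M = 115.4, SD = 19.1, N = 204 for the women, with decimal points restored. It uses separate variance estimates for the standard error, which the authors do not confirm, and the function name is hypothetical.

```python
import math

def t_independent_means(m1, sd1, n1, m2, sd2, n2):
    """t ratio for two independent means with separate variance estimates.

    Assumption: the standard error of the difference is
    sqrt(sd1**2 / n1 + sd2**2 / n2).
    """
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / se_diff

# Women versus men, values read from the Total row of Table 1.
t_women_vs_men = t_independent_means(115.4, 19.1, 204, 107.0, 18.3, 205)
print(round(t_women_vs_men, 2))
```

The ratio comes to roughly 4.5, which agrees with a reported t beginning with 4 and with p < .001.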
Despite our grave doubts concerning the
[legitimacy] of comparing individual pairs of
means, in view of these results, it might be
noted in Table 1 that, for the men, the
performance of the least anxious Ss (Level 1)
was actually superior to that of one intermediate
group (Level 4), and that the performance
of the most anxious Ss (Level 6),
far from being inferior to that of the intermediate
groups, was best of all. However, in
not a single instance was a t between means
at various levels significant. [Several lines are
damaged in the source; the legible fragments read:
“with respect to the . . . the least anxious . . .
(Level 1) did . . . differ significantly from . . .
of the t test did . . . significant . . . differences
between the most anxious . . . and two intermediate
levels.”] Using an [illegible] estimate of the error
variance of the difference between means
(2, p. [illegible]), the t between Levels 6 and 4
was 2.28 and that between Levels 6 and 3 was 2.20.
Assuming
(incorrectly, as we believe) that t tests are
permissible under these circumstances, both
values are significant, p < .05.

Similar analyses based on only the five lev- 
els used by Matarazzo and Phillips yielded 
results that were completely congruent with 
those derived from the analysis for six levels. 


Discussion 

It may be seen that these results offer little
support to the view that any relation,
nonmonotonic or other, exists between A-scale
scores and Digit Symbol performance. The
specific hypotheses of Matarazzo and Phillips
that the mean at Level 1 would be significantly
poorer than the means at each level
from 2 to 5 were not substantiated in a single
instance. The specific hypothesis that Level 3
would be superior to 6 was substantiated in
the case of women, but the results were in
the opposite direction for men. Taken in
conjunction with their own findings (3), not a
single suggested relation between individual
levels is unequivocally borne out. Consequently,
the argument that despite the
nonsignificance of an over-all test, simple tests of
significance between levels are permissible on
the basis of previous empirical results in this
area must be flatly rejected in the future. Of
course this does not preclude tests of simple
effects based on theoretical expectations. But,
in the latter case, it would be necessary for
the theory to predict the locus of the maximum
much more precisely than has yet been
done. Otherwise, there is no guarantee that
inadvertent advantage might not be taken of
chance differences among levels.


Summary 


The present study was concerned with the 
hypothesis that the relation between Taylor 
A-scale scores and performance on a Digit 
Symbol task is nonmonotonic, with Ss in the 
middle ranges of the A-scale distribution per- 
forming better than those at the extremes. 
With Ss classified at six levels of anxiety, no 
consistent evidence was obtained to support 
this hypothesis or any more general hypothe- 
sis concerning a relation between A scale and 
Digit Symbol scores. 


Received July 17, 1956. 


Some Social and Cultural Factors Determining
Relations Between Authoritarianism and


In a recent evaluation of research on the
authoritarian personality, Masling (18) criticizes
the tendency for value judgments to become
confounded with science in this research
[area]. According to him all “bad” personality
characteristics tend to be ascribed to
authoritarians without, in many instances,
substantiating evidence. Masling feels that the writing
on the authoritarian personality has implied
a positive relation between neuroticism and
authoritarianism. He refutes such a relationship
by summarizing four empirical studies
that failed to find evidence of a relation
between authoritarian ideology and measures of
neuroticism and maladjustment.

While some of Masling’s points are well
taken, the studies he summarizes do not succeed
in answering the question as to the relation
between authoritarian personality and
neurotic tendencies. For example, Davids (4)
has recently reported highly significant
correlations between F-scale scores and measures
of maladjustment derived from the Manifest
Anxiety Scale and clinical evaluations by an
experienced psychoanalyst. Also Freedman,
Webster, and Sanford (9) have recently reported
significant correlations between the F
scale and the hysteria and psychasthenia
scales from the MMPI. It was the purpose of
the present study to investigate some of the
factors that may be contributing to these
conflicting results.

[Footnote fragment:] . . . contained in this report are those of the authors . . .

In examining the discrepancies between the 
findings of Davids and of Freedman, Webster, 
and Sanford and those reported by Masling, 
one of the more obvious differences between 
these studies is in the populations from which 
the Ss were drawn. Davids’ sample as well as 
that of Freedman et al. consisted of university
undergraduates, while three of the four stud- 
ies summarized by Masling employed Ss se- 
lected from nonstudent populations. In one 
case the sample was drawn from patients in 
a psychiatric clinic, while in another the sam- 
ple was selected randomly on the basis of 
census tracts in a large city. In the third 
study the sample consisted of Naval recruits. 
Even in the fourth study the sample was com- 
posed of university summer-school students 
who are not likely to be representative of un- 
dergraduates. These differences in population 
samples suggested to us that the differences 
in findings might well be due to some complex
relation between authoritarianism, neuroticism,
and certain sociocultural variables.
In order to examine this possibility, we have
repeated Davids’ study, using a sample of 
Naval enlisted men. Contrary to Davids’ find- 
ings with university students, we expected the 
Navy Ss to yield negative results similar to 
those reported by Masling. 

In his study, Davids failed to find a sig- 
nificant relation between authoritarian ideol- 
ogy and intolerance of ambiguous auditory 
stimuli. Since the concepts of “rigidity” and 
“intolerance of ambiguity” play a central role 
in the theory of authoritarian personality (11, 
12), we wished to determine if Davids’ failure 
to find the predicted relation might have been 
due to the sample studied. Accordingly, in the 
present study, we administered the auditory 
projective test (7) to the Naval trainees, and 
again investigated relations between authoritarianism
and reactions to ambiguity.2


Method 


Subjects. Forty-eight Naval enlisted men 
served as Ss in the present experiment. These 
young men, the majority of whom were of 
college age although very few had attended 
college, were undergoing training at the New 
London Submarine Base. The assessment 
measures used in this study were included in 
the battery of procedures that is administered 


routinely to all trainees upon their arrival at 
the submarine school. 


Authoritarianism. The Ss were administered the
30-item California F scale (1). Total scores ranged
from a low of 95 to a high of 172, with a mean of
134.3. The mean score per item was 4.48.

Intelligence. Performance on the standard Naval 
General Classification Test (GCT) was used as a 
measure of intelligence. Conventionally, scores on 
this test are converted to T scores, forming a distribution
with a mean of 50 and standard deviation
of 10. In the present sample, GCT scores ranged from
33 to 67, and the mean score was 56.

Manifest anxiety. The Taylor scale of manifest 
anxiety (22) was administered to the Ss. Scores 
ranged from 2 to 22, with a mean of 8.9. 

Psychosomatic Inventory. This inventory, constructed
by McFarland and Seitz, is designed to provide
a measure of neuroticism (17). High scores indicate
normality and low scores indicate neuroticism.
Scores for the present Ss ranged from 91 to 383,
with a mean of 292.7.


2 Since the F scale (authoritarianism) and the E
scale (ethnocentrism) have been shown to correlate
.77 (1), they are frequently used interchangeably
and results are often generalized from one scale to
the other. Also, the concepts of “rigidity” and
“intolerance of ambiguity” have been used interchangeably
(11, 12), and both have been applied to high
authoritarians. In the present report we will not
attempt to differentiate between these concepts.

Reactions to ambiguous auditory stimuli. The Az- 
zageddi Test was administered; this test is an audi- 
tory projective technique consisting of passages of 
spoken communication containing contradictory and 
irreconcilable statements and ideas (4, 5, 7). Al- 
though each statement, by itself, is meaningful and 
coherent, when several statements are intermingled 
in a passage of speech, there is much confusion and 
contradiction inherent in the passage. Consequently, 
the S is confronted with confusing and contradic- 
tory ideas and asked to recall as many of the ideas 
as he can from the passage. The total number of 
phrases and statements in the test is 112. The num- 
ber of items recalled by the present Ss ranged from 
a low of 12 to a high of 60, with a mean recall score 
of 33. After hearing, and recalling, the eight pas- 
sages which constitute the test, the Ss were pre- 
sented with sheets on which they could indicate 
their personal reactions to the auditory projective 
test. On 6-point rating scales, they indicated the 
degree of ambiguity they perceived in the spoken 
material, and the degree of satisfaction (liking) or
dissatisfaction (disliking) they experienced while
attempting to cope with the task.
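Two of the scale descriptions in the Method involve simple arithmetic that can be checked directly: the F-scale mean per item, and the position of the sample's GCT mean on a T-score scale with mean 50 and standard deviation 10. The function names below are hypothetical and the sketch only restates the figures given in the text.

```python
def mean_per_item(total_mean, n_items):
    """Mean item score implied by a mean total score on an n-item scale."""
    return total_mean / n_items

def t_score(z):
    """Convert a standard (z) score to a T score (mean 50, SD 10)."""
    return 50.0 + 10.0 * z

def z_from_t(t):
    """Convert a T score back to a standard (z) score."""
    return (t - 50.0) / 10.0

# F scale: mean total of 134.3 on 30 items.
f_scale_item_mean = mean_per_item(134.3, 30)

# GCT: the sample mean of 56 expressed in standard-deviation units.
gct_mean_z = z_from_t(56)

print(round(f_scale_item_mean, 2), gct_mean_z)
```

The first figure reproduces the reported 4.48 per item. The second places the trainees about 0.6 standard deviations above the GCT reference mean.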


Results 


Table 1 presents the product-moment cor- 
relations among the various experimental 
measures. Since these findings will be com- 
pared with those obtained by Davids with 
university Ss, we will first review briefly the 
significant findings from that investigation 
(4). It was found that the F scale correlated 
.69 with manifest anxiety, and — .57 with 
scores on the Psychosomatic Inventory. In 
both cases these coefficients, which are sig- 
nificant beyond the .01 level, indicate that 
the college students who were high on au- 
thoritarianism tended to score relatively high 
on measures of neuroticism. Also, students 
who were high on authoritarianism had lower
grade-point averages (— .40) although there 
was no significant relation between grade- 
point averages and neuroticism. 

Table 1

Product-Moment Intercorrelations Among Experimental Measures
(N = 48)

Measure               Intelligence   Auditory   Manifest
                      (GCT)          test       anxiety

F scale                 -.24          -.10
Intelligence (GCT)                    +.37**
Auditory test                                    +.08
Manifest anxiety

[The remaining entries, including a coefficient of -.26*,
cannot be placed reliably from the source.]

* Significant at the .05 level for a one-tailed test.
** Significant at the .01 level for a one-tailed test.


It is evident from Table 1 that the pattern
of intercorrelations obtained with the present
sample of Naval trainees is quite different.
Here, as predicted, there is no significant
relation between F-scale scores and either
measure of neuroticism. Again, however, there is a
significant negative correlation between the F
scale and intelligence which, for this sample,
is constituted by GCT scores. This finding is in
accord with results reported by several previous
investigators who have found significant
relations between authoritarianism and various
measures of intelligence (1, 3, 4, 5, 13).
It should also be noted that there is no evidence
of a relation between manifest anxiety
and intelligence in the present sample. This
finding agrees with those reported in some
studies (4, 5, 6, 20, 21), but is at variance
with certain other results (14, 16, 19).
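The significance of a product-moment correlation in a sample of 48 can be judged with the usual t transformation, t = r * sqrt(N - 2) / sqrt(1 - r^2), on N - 2 df. The sketch below applies it to two coefficients, reading the damaged Table 1 entries as -.24 (F scale with GCT) and -.10 (F scale with recall on the auditory test). That reading of the table is an assumption, and the function name is hypothetical.

```python
import math

def t_for_r(r, n):
    """t ratio for testing a product-moment correlation against zero.

    Uses t = r * sqrt(n - 2) / sqrt(1 - r**2), with n - 2 degrees of freedom.
    """
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)

N = 48
t_f_intelligence = t_for_r(-0.24, N)   # F scale with GCT (assumed reading)
t_f_recall = t_for_r(-0.10, N)         # F scale with ideas recalled

print(round(t_f_intelligence, 2), round(t_f_recall, 2))
```

With 46 df the one-tailed .05 point lies at roughly 1.68, so a coefficient of -.24 sits at the margin of significance for a one-tailed test, while -.10 falls well short of it.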

As a further test of our predictions we computed
the significance of the difference between
the correlations of the F scale and the
neuroticism measures obtained from the university
Ss and the Navy Ss. For both measures
of neuroticism the difference between the
correlations obtained from the two samples is
significant beyond the .05 level of confidence.
Table 2 shows the means and variances obtained
by the Navy Ss and the university Ss
on the F scale, Manifest Anxiety Scale, and
Psychosomatic Inventory. Statistical analyses
of these findings show pronounced differences
between the two groups. As we would expect,
the Navy personnel, as a group, are much
higher on authoritarianism and much lower
on both measures of maladjustment.3

Let us now examine the relation between 
F-scale scores and the Ss’ performance in re- 
sponse to the ambiguous auditory stimuli. 
Taking as an indirect measure of ambiguity 
tolerance the number of ideas recalled by the 
Ss, the correlation of — -10 shown in Table 1 
indicates that scores on the F scale are not 
correlated significantly with intolerance of 
ambiguity. In order to make a more direct 


8Since the Ss in this study were trainees in Sub- 
they constitute a highly selected 
d that the selected nature of 
mitigated against a relation- 
ship between authoritarianism and neuroticism, it 
should be noted that while the Naval Ss are lower 
on neuroticism scores than the college samples, they 
are considerably higher, as a group, on the F scale. 
These mean differences between the two samples and 
on the two dimensions is in itself evidence against a 
universal positive relationship between authoritarian- 


jsm and neuroticism. 


marine School, 
group. Lest it be argue 
our sample may have 


Table 2 


Differences Between 


Naval Enlisted Men and University Students on Measures of 
‘Authoritarianism and Neuroticism 


Navy Ss University Ss 
(N = 48) (N = 20) 
ee a 
Measure Mean Variance Mean Variance F i 
pale 1343 268.6 89.0 354.6 132 ose 
anifest anxi | 8.9 22.6 20.8 64.3 2.85** 6.26** 
S anaa Fe 292.7 3583.4 137.7 25730.9 7.18** 4.20** 


* 
Significant beyond the .01 level. 
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test of the relation between authoritarianism 
and ambiguity tolerance, the Ss were dichoto- 
mized into a high group and a low group on 
the basis of their F-scale scores, and this 
classification was related to personal reactions 
to the auditory test. Chi-square tests of asso- 
ciation, corrected for continuity, showed no 
significant relation between the F scale and 
the Ss’ ratings of the test material as either 
ambiguous or unambiguous (x* = 87), and 
no significant relation between the F scale 
and the Ss’ ratings of whether they liked or 
disliked the auditory test (x? = 1.02). Thus, 
neither the indirect measure of tolerance of 
ambiguous spoken passages, based on the 
number of ideas recalled, nor the direct meas- 
ures, based on self-ratings, were found to be 
associated with authoritarianism. In this re- 
spect, the present findings with Naval trainees 
duplicate the previous results obtained by 
Davids with university undergraduates. 
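
The chi-square test "corrected for continuity" mentioned above can be sketched as follows. This is only an illustration of the usual Yates correction for a 2 x 2 table; the cell counts are hypothetical (the report gives only the resulting values, .87 and 1.02), and the function name is an assumption made for the example.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square for the 2 x 2 table [[a, b], [c, d]] with Yates' continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column total must be positive")
    # Shrink |ad - bc| by n/2 (never below zero) before squaring.
    diff = max(abs(a * d - b * c) - n / 2.0, 0.0)
    return n * diff ** 2 / denom


# Hypothetical counts for 48 Ss: rows = high / low F-scale group,
# columns = test rated ambiguous / unambiguous. Not the study's data.
chi2 = yates_chi_square(10, 14, 14, 10)

# Tabled chi-square value for 1 degree of freedom at the .05 level.
CRITICAL_05_DF1 = 3.841
print(round(chi2, 2), chi2 > CRITICAL_05_DF1)
```

With one degree of freedom, values such as the reported .87 and 1.02 fall well short of the 3.84 needed at the .05 level, which is why no association is claimed.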


Discussion 


The above results show that Navy men are 
more authoritarian, yet less neurotic, than 
the university students. Even though egali- 
tarian attitudes characterize university un- 
dergraduate cultures, as a group, the students 
are less well adjusted than the Navy enlisted 
men who are high on authoritarianism, but 
do not seem to suffer from undue anxiety or 
neurotic symptoms. That is to say, in the 
university social setting where Ss tend to be 
low on authoritarianism, they tend to be rela- 
tively high on neuroticism and there is a posi- 
tive association between being high in both 
dimensions. In the social setting of a military 
installation, however, and probably in many 
other nonacademic environments, the stand- 
ard of reference on authoritarianism is quite 
high. And in such a setting there seems to be 
no relation between authoritarianism and neu- 
roticism (10, 18). 

The above results are consistent with the
hypothesis that relations between authoritarian
ideology and neuroticism are determined to a
large degree by sociocultural factors. There are
several reasons for suspecting that sociocultural
factors might influence this relationship. One is
to be found in the general attitudes of liberalism
and nonprejudice that characterize the subcultures
of most universities. Almost by definition the
neurotic is less apt to adapt to and assimilate the
attitudes and values of teachers and classmates
in his social environment. His social perceptions
are likely to be less sensitive and realistic.
Moreover, aggressive and hostile tendencies may
lead him to actively reject the group attitudes
and values. Placed in an environment where it is
socially rewarding to express liberal democratic
attitudes, the neurotic is less apt to assimilate
than the so-called normal individuals. In a social
setting such as a Naval training station, however,
in which there is no premium placed on liberalism
and egalitarianism, it seems quite likely that
neuroticism would not be a correlate of
authoritarianism.

Our failure to find a relationship between
F-scale scores and several measures of tolerance
for ambiguity must take its place along with an
increasing number of other investigations that
have failed to confirm or replicate previously
reported research on authoritarianism
(2, 4, 5, 8, 10, 15).


Summary 


The main purpose of the study was the attempt
to clarify some of the confusion and contradiction
that is currently found in the research area
concerned with authoritarianism and personal
adjustment. The different results reported by
Masling with nonuniversity Ss and by Davids with
university Ss suggested that the sociocultural
setting in which Ss are examined might well have
a significant influence on relations between the
variables of authoritarianism and neuroticism.
Although a group of Naval enlisted men examined
in the sociocultural setting of a military
installation were found to be higher on
authoritarianism than were a group of university
Ss, they tend to be significantly lower on measures
of neuroticism. And with the military personnel
there was no significant relation between
authoritarianism and neuroticism, whereas with the
university Ss, there was a significant positive
association between the variables.


Moreover, using the Naval enlisted trainees
as Ss, there was no relation between
authoritarianism and intolerance of ambiguous
auditory stimuli. This finding, which is
[illegible] based on authoritarian [illegible],
is in keeping with previous findings with
university Ss. It was found, however, that
authoritarianism was significantly correlated with
a measure of intelligence, a finding that fits
well with results of previous studies. Also, in
the present study, manifest anxiety and
intelligence were not found to be associated.


Received January 8, 1957.
Early publication.


Spoken and Written Vocabulary; Their Relation to a
Standard Vocabulary Test, Intelligence and Anxiety 1


Maurice W. Sullivan and Allen D. Calvin 
Hollins College 


A study was designed to investigate the re- 
lationship between various measures of vo- 
cabulary, intelligence, and anxiety. 

Method. The Ss were 40 female under- 
graduate students from Hollins College. Otis 
higher form intelligence tests and the Taylor 
A scale were administered in group form. On 
a different day, all Ss were asked to write an 
essay on success which had to be at least 300 
words long. Each S was individually inter- 
viewed in the psycholinguistics laboratory 
where she was asked to discuss five campus 
topics; e.g., should the college eliminate Satur- 
day classes. Conversations were tape recorded 
through a concealed microphone. Twenty Ss 
were interviewed by a professor (MWS), and 
the other twenty by a student. At the con- 
clusion of the interview each S was given the 
Wechsler vocabulary test. 

Analysis. The following scores were obtained
for each student: (a) Otis IQ, (b) Wechsler
vocabulary, (c) A scale, (d) number of written
words, (e) for every fourth written word the word
frequency was obtained from the L list of the
Thorndike-Lorge Teachers Word Book, (f) every
fourth written word was analyzed, and a count was
made of the number of words which fell outside of
the M


group (words in the M group occurred 1,000
times or more on the Lorge magazine count)
and the number was divided by the S’s total
number of written words analyzed, (g) number
of oral words, (h) for every fourth oral
word the word frequency was obtained from
the L list of the Thorndike-Lorge Teachers
Word Book, (i) every fourth word was analyzed,
a count was made of the number of
words which fell outside of the M group, and
the number was divided by the S’s total number
of oral words analyzed, (j) a ratio of
verbs to adjectives was obtained using every
fourth oral word.


1 An extended report of this study may be obtained
without charge from Maurice W. Sullivan,
Hollins College, Va., or for a fee from the American
Documentation Institute. Order Document No. 5098,
remitting $1.75 for microfilm or $2.50 for photocopies.

Results. The oral word frequency, written 
word frequency, and verb to adjective ratio 
all had to be discarded because of lack of 
reliability. Intercorrelations of all reliable 
measures were computed, and the following 
significant correlations were obtained:
vocabulary test and amount of written words
.37 (p < .05); Otis scores and vocabulary
test .43 (p < .05); amount of non-M oral
words and A-scale scores -.45 (p < .01).

A Holzinger and Harman cluster analysis
was computed. The variables, amount of
written words, amount of written non-M
words, amount of oral words, vocabulary test,
and Otis scores, formed a cluster with a
B coefficient of 1.86; reflected A-scale scores
and amount of non-M oral words formed another
cluster with a B coefficient of 5.29.
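
The B coefficient can be illustrated with a short sketch. It assumes the common definition: the mean intercorrelation among the variables in a cluster divided by their mean correlation with the remaining variables. Some presentations multiply this ratio by 100; the values reported here (1.86 and 5.29) read like plain ratios, so no scaling is applied, and that choice is an assumption. The correlation matrix is hypothetical, not the study's data.

```python
def b_coefficient(r, members):
    """Mean within-cluster correlation divided by mean correlation with outside variables."""
    inside = sorted(set(members))
    outside = [k for k in range(len(r)) if k not in inside]
    if len(inside) < 2 or not outside:
        raise ValueError("need at least two cluster members and one outside variable")
    # Each within-cluster pair is counted once.
    within = [r[i][j] for pos, i in enumerate(inside) for j in inside[pos + 1:]]
    # Every pairing of a cluster member with an outside variable.
    between = [r[i][j] for i in inside for j in outside]
    return (sum(within) / len(within)) / (sum(between) / len(between))


# Hypothetical 4-variable correlation matrix: variables 0-2 hang together,
# variable 3 is only weakly related to them.
R = [
    [1.0, 0.6, 0.6, 0.2],
    [0.6, 1.0, 0.6, 0.2],
    [0.6, 0.6, 1.0, 0.2],
    [0.2, 0.2, 0.2, 1.0],
]
b = b_coefficient(R, [0, 1, 2])
print(round(b, 2))
```

A ratio well above 1 indicates that the variables correlate more strongly with each other than with the rest of the battery.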
Brief Report. 

Received December 5, 1956.


A Cross-Cultural Comparison of the MMPI


Ronald Taft

University of Western Australia 1


The Minnesota Multiphasic Personality Inventory
is perhaps the most widely used personality
inventory in which the ultimate criterion for the
selection of items was purely empirical. The
validation was carried out “blindly” according to
whether each item discriminated between a sample
of psychotics of specified types and a sample of
“normals.” The question arises in such cases
whether the [illegible] of such a test can be
generalized beyond the geographical and temporal
limitations of the validating population
(Minnesotans in the middle 1930's).

Strictly speaking, this question can only be
fully answered by replicating the validation
procedure over a wide sample of subcultures.
Some light can, however, be thrown indirectly
on the generality of the test by comparing
the mean scores obtained on comparable
populations in different cultures. This present
paper carries this out using college students
in Australia and the U.S.A. as the comparison
populations.

Subjects. The U. S. results are based on
two integrative studies by Goodstein (4) and
Black (quoted in 4) which summarize the
results of a number of MMPI investigations
on male and female undergraduates respectively.
These two integrative studies have also
been supplemented by a later study by Clark
(2). Thus the U. S. college results are based
on samples tested in various regions of the
U.S.A., mainly in state colleges, and
representing varied academic fields.2


1 The completion of this study was made possible
[illegible] from the Carnegie Corporation
[illegible]. The author’s thanks are given to
Ron[illegible] Jennings and Iain Wyatt for
hand-scoring the [illegible].

2 Clark does not indicate whether he used the K
correction. All of the other studies, including our
own, used this correction.



The Australian subjects were first-year psy- 
chology students at the University of West- 
ern Australia (state) in 1953 and 1955. Per- 
sons who had not been educated in Australia 
or Britain were excluded and there were 65 
males and 67 females. The subjects took the 
test as part of their laboratory training; they 
were informed that the test was not compul- 
sory and that the results would be used for 
research purposes by the writer who was one 
of their instructors. The test was not anony- 
mous but there were only two protocols that 
were too incomplete to be used.

In this type of study it is important to dis- 
cuss the comparability of the samples. The 
Australian sample may differ from at least 
some of the U. S. samples in the following
ways:

1. The students had elected psychology as one of
their courses.

2. The average age, especially of the males, was
high owing to the presence of a number of part-time
students. The males averaged 26.6 years (σ, 8.2)
and the females 20.8 (σ, 7.3).

3. As a result of the selective method of entry to
the University there are few subjects with relatively
low IQs. The mean IQ on a group test of entering
freshmen is 124 and only approximately 15 per cent
fall below 116.

In the discussion section of this paper we 
shall point out why we consider that these 
biases in the Australian sample have not af- 
fected the findings. 


Results 


The means and standard deviations of the 
male and female samples are presented in 
Tables 1 and 2 respectively. 

The only significant differences between the 
U. S. and Australian means are on Mf; both 
the male and female Australians come out 
more feminine. On the male comparisons, the
Australian mean is outside the range of the
ten U. S. means only on the D scale.

The variability of the Australian subjects
tends to be higher than the American and the
SDs are significantly different (0.05 level)
on Pd and Pa for the males and Pa and Sc
for the females. From the practical point of
view it is also important to compare the


Table 1

Comparative MMPI Results for U. S. and Australian Samples—Males

Column 1: U. S. Males* (N = 5742), median mean and median SD
Column 2: Range of means in the ten studies
Column 3: West Australian Males (N = 65), mean and SD
Column 4: West Australian Males aged 17-24 years (N = 30), mean
t test: 3 vs. 1 (shown where t > 1.0)

                 1                 2             3              4
Scale            Mean    SD        Range         Mean    SD     Mean    t
?                -       -         -             6.9     13.8   11
L                -       -         -             3.6     2.0    3.5
F                -       -         -             4.5     3.2    4.9
K (Raw score)    14.5    4.6       13.8-16.4     15.3    4.4    15.9    [illegible]
Hs               52.5    8.3       50.5-54.1     53.0    8.1    52.3    -
D                53.0    10.5      50.2-53.9     55.0    10.5   57.6    1.5
Hy               55.5    7.8       52.9-52.8     57.4    8.0    57.9    1.9
Pd               56.0    9.9       54.3-58.1     57.4    11.9   56.1    -
Mf               58.0    10.2      54.9-62.7     62.7    9.6    65.0    3.9
Pa               53.0    8.0       50.0-54.1     53.8    9.8    54.6    -
Pt               56.4    10.2      53.6-58.0     54.8    9.3    57.9    1.0
Sc               56.9    10.5      52.4-57.2     56.4    10.3   59.1    -
Ma               58.1    10.1      55.8-60.2     57.5    10.3   57.1    -

* The U. S. figures are based on a combination of ten studies; nine which were integrated in (4) and a tenth (2), which recorded
results derived from 707 college undergraduate males.

Table 2

Comparative MMPI Results for U. S. and Australian Samples—Females

Column 1: U. S. Females* (N = 5014), mean
Column 2: U. S. Females† (N = 1817), SD
Column 3: West Australian Females (N = 67), mean and SD
Column 4: West Australian Females aged 17-21 years (N = 53), mean
t test: 3 vs. 1 (shown where t > 1.0)

                 1        2        3              4
Scale            Mean     SD       Mean    SD     Mean    t
?                -        -        4.7     13.3   5.3
L                -        -        4.4     2.3    4.6
F                -        -        4.6     3.1    4.6
K (Raw score)    15.5     -        14.8    4.4    14.7    1.5
Hs               49.0     7.0      50.4    8.0    50.1    1.3
D                49.0     8.5      51.0    8.2    50.6    1.8
Hy               53.5     7.75     53.7    8.2    53.2    -
Pd               54.0     9.0      54.3    9.8    53.7    -
Mf               50.5     9.25     47.3    9.2    47.2    2.9
Pa               53.5     7.75     52.4    10.0   52.1    -
Pt               53.0     8.0      54.0    9.2    54.3    -
Sc               55.0     7.5      56.3    10.2   56.5    1.0
Ma               55.5     10.25    55.2    10.7   54.5    -

* The U. S. figures represent the median mean of 15 reported studies which were integrated by Black. The figures were
estimated to the nearest 0.5 from the graph appearing in (4, p. 440). Further details of the integrative study could not be obtained.

† These figures represent the mean (weighted) SD from the four studies that were available (2, 3, 5, 6).





Table 3

Comparative Levels of “Abnormal Scores” for U. S. and Australian Samples*

                      Males                                        Females
         T 70 Score           Per cent over 70        T 70 Score           Per cent over 70
                  West                  West                   West                  West
Scale    U. S.    Australian  U. S.†    Australian    U. S.    Australian  U. S.†    Australian
Hs       69       69          2.8       1.5           63       66          [illegible]
D        74       76          5.4       9.2           [illegible]
Hy       71       73          2.5       [illegible]   69       71          2.9       [illegible]
Pd       76       81          8.0       15.4          72       74          4.4       3.0
Mf       78       82          9.8       26.2          69       66          4.0       1.5
Pa       69       73          3.0       7.7           69       72          3.8       4.5
Pt       77       73          9.8       4.6           69       72          3.0       4.5
Sc       78       77          7.8       10.8          70       77          6.0       9.0
Ma       78       78          9.8       12.3          76       77          9.5       7.5

* Based on the means and SDs reported in Tables 1 and 2.

† A weighted mean combining the results reported in (3) and (5). N = 1158 males, and 473 females.


proportion of the subjects at the “abnormal”
extreme; i.e., the T 70 or plus 2 SD level. These
data are presented in Table 3.

The level of the T 70 scores differs by 5 or
more on Pd for the males and D and Sc for
the females. The Australian males obtain
significantly more (0.05 level) abnormal scores
on Hy, Pd, Mf, and Pa, but the females do
not differ significantly on any of the scales.
Fifty per cent of the Australian males obtain
at least one abnormal score, compared with 33
per cent of a U. S. sample (3), while the
figures for the females were 26 per cent and 27
per cent respectively.


Discussion


The means of the American and Australian
college samples differ only on Mf. At the same
time it is worth noting that this scale has the
highest range of all scales over the ten U. S.
samples, and the Australian mean is within
that range. It would seem that this scale is the
one most susceptible to cultural influences, a
conclusion which is consistent with the mode
of constructing the scale in the first place,
i.e., distinguishing men’s and women’s
responses rather than the responses of some
groups discriminated on personality criteria
as in the other scales.

Comparing the American and Australian
samples at the “abnormal” level of the scales,
the Australian males tend to score higher on
Hy, Pd, Mf, and Pa, while the females are
higher on D and Sc. If we judge the means
and the T 70 scores from a practical point of
view, we might call scores “equivalent” when
the differences are less than 5 points (half a
standard deviation). Excluding also the Mf
scales which differ significantly, we find that
we can call seven of the male scales equivalent
(not Mf or Pd) and six of the female
scales (not Mf, D, or Sc).
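
The arithmetic behind the "abnormal" level and the "equivalent" judgment can be sketched briefly. The sketch takes the T 70 level to be the sample mean plus 2 SD, as stated in the Results, and applies the 5-point rule described above. The function names are illustrative assumptions; the means and SDs are the male Hs and Pd entries of Table 1.

```python
def t70_level(mean, sd):
    """Score lying 2 SDs above the sample mean (the 'T 70 or plus 2 SD level')."""
    return mean + 2.0 * sd


def equivalent(score_a, score_b, tolerance=5.0):
    """Practical equivalence as described in the text: a difference under 5 points."""
    return abs(score_a - score_b) < tolerance


# Male Hs scale, Table 1: U. S. mean 52.5, SD 8.3; Australian mean 53.0, SD 8.1.
hs_us = t70_level(52.5, 8.3)
hs_aus = t70_level(53.0, 8.1)

# Male Pd scale, Table 1: U. S. mean 56.0, SD 9.9; Australian mean 57.4, SD 11.9.
pd_us = t70_level(56.0, 9.9)
pd_aus = t70_level(57.4, 11.9)

print(round(hs_us), round(hs_aus), equivalent(hs_us, hs_aus))
print(round(pd_us), round(pd_aus), equivalent(pd_us, pd_aus))
```

On these figures the two Hs levels nearly coincide, while the Pd levels are more than 5 points apart, which matches the exclusion of Pd from the equivalent male scales.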

By and large, then, the scales “hold up” in 
the Australian cultural setting. Let us now 
consider the implications of the differences 
and resemblances which we have found. There 
are three main factors which may possibly 
lead to differences: 

1. Differences in the relationship between 
the college samples and the general American 
or Australian populations; 

2. Variations in the psychological signifi- 
cance of the items according to the cultural 
background of the subjects; and 

3. Personality differences between Austral- 
ians and Americans. 

In the introduction, some possible biases in 
the Australian sample were suggested, but 
there is some strong circumstantial evidence 
that these have not caused any of the differ- 
ences found. The Australian subjects were all 
taking a course in psychology, but Sopchak’s 
results (5) based on a similar sample do not 
show any of the variations from the other 
U. S. studies that were shown by the
Australian subjects. The IQs of the Australian
subjects may also have been higher than some
of the U. S. samples, but the differences, re- 
ported by Applezweig (1), between groups 
split on IQ at 115 show that the selectivity 
of the sample on intelligence could not ac- 
count for the differences on the MMPI. On 
the contrary, the subjects with the higher 
IQs had fewer abnormal scores on the scales 
on which the Australians had the greater 
number of such scores. The Australians were 
also possibly older than comparable U. S. 
groups, but the results presented in Tables 1 
and 2 for the younger subjects show that the 
differences found could not be attributed to 
this bias. 

We must look, therefore, to explanations 2 
and 3 for the differences that were found; 
these are the psychological “pull” of the items
and personality differences between the two 
groups of subjects. Whether only one of these 
influences or both were operating, we cannot 
say on the data available. 

In the instances where no difference is 
found between the Australian and American 
results, it is possible that two or three of the 
possible causes of differences were operating 
and counterbalancing each other. It seems 
more parsimonious, however, to assume that 
none of them was operating. Therefore, cul- 
tural differences do not appear to have af- 
fected the psychological significance of the 
equivalent items; namely, Hs, D, Hy, Pa, Pt, 
Sc, and Ma for males, and Hs, Hy, Pd, Pa, 
Pt, and Ma for females. On the other scales 
where differences were found we cannot de- 
cide whether cultural differences were respon- 
sible without a careful cross-validation study 
over the Australian general population, repli- 
cating the original Minnesota study. 

In conclusion it should be remarked that 
the differences in the cultural setting of 
American and Australian university students 
are not radical, and the results of this study 
should not be taken as a justification for
extending uncritically the MMPI to vastly
different cultures.3


3A report has just come to hand (N. D. Sund- 
berg: The use of the MMPI for cross-cultural per- 



Summary 


One test of whether an empirically vali- 
dated inventory is culture bound is to com- 
pare the results of the application of the in- 
ventory to two comparable populations in 
differing cultures. The MMPI was given to a 
sample of students at the University of West- 
ern Australia and the results were compared 
with the means and “abnormal” score levels 
of a number of American college samples. The 
Australian subjects scored higher than the 
Americans on Mf (male and female), Pd 
(males), D (female), and Sc (female). The 
scores on the other seven male and six female 
scales were equivalent and it is inferred that 
the MMPI items on these scales are not cul- 
ture bound, at least within the culture varia- 
tion studied. Where differences were found 
it is impossible to decide on the evidence 
whether they were determined by true per- 
sonality differences between the two groups 
of subjects, or by differences in the psycho- 
logical significance of the items from one cul- 
ture to the other. 


Received July 23, 1956.


Roland R. Tougas

University of Akron

and

Carroll E. Izard

General Electric Company


The April 1953 issue of this Journal carried
an article by the present authors entitled
“Response to the Human Face as a Standard
Stimulus” (1). The paper examined several
hypotheses relating to subjects’ responses to
pictures of people. Paul F. Secord (personal
communication) pointed out that the findings
were indeterminate due to the manner
in which χ² was applied to the data.
Consequently, it was decided to retest the
hypotheses of the earlier paper with an
experimental design yielding data that could be
subjected to a definitive analysis. The hypotheses
are stated in the following questions: (a) Do male
and female subjects give a different proportion
of “like” responses to the photographs?
(b) Do the male and female subjects respond
differentially to the two sexes represented by
the pictures? (c) Do the male and female
subjects respond differentially to generations
or age groups represented by the pictures?

This paper presents the new experiment,
compares the findings with those of the previous
study, and discusses the results and conclusions
of the present work.


Method


In this study the stimuli were pictures of
human faces representing both sexes and
different generations or age groups. Subjects
were male and female college students. The
response consisted of the subject indicating
whether he “liked” or “disliked” the individual
in the picture presented to him.

Subjects. The sample, 60 males and 60 females,
was drawn from the ROTC unit and
the College of School of Nursing at a midwestern
university.


1 The authors wish to express their appreciation to
Dr. C. I. Bliss, Connecticut Agricultural Experiment
Station and Yale University, for his very valuable
assistance in the design and analysis of the present
experiment.

Selection of pictures. Over one thousand
pictures were obtained on the main street of
Syracuse, New York. Brief information obtained
about the people who volunteered to
contribute their pictures indicated that they 
were from all walks of life. From this pool of 
photographs about 250 had been preselected 
according to the following criteria: (a) full 
face view; (b) nonemotional expressions; 
(c) equal number of males and females; (d) 
age variation. For this study a series of 60 
experimental pictures were chosen at random 
from this group of photographs with the re- 
striction that both sexes and three age groups 
(roughly 2-5; 18-25; and 45-50) were 
equally represented. 

Administration. The pictures were presented
one at a time to each subject. The
instructions were as follows:


We have here a series of 60 portraits of people’s
faces. Obviously, when we are confronted with a 
stranger, looking at him and at his face particularly, 
we get certain first impressions about the person. In 
this research, we are seeking for your first impres- 
sion. We will show you the different pictures one at 
a time; some of them will impress you as very like- 
able people whose company you would like. Others 
may impress you as people you wouldn’t care to 
meet or associate with at all. What we want you to 
do is to tell us as we show you each picture, whether 
you feel the person is one you “like” or one that 
you “dislike.” Sometimes you may feel uncertain
about whether you like or dislike the picture of the
person, but we want you to decide for every picture.
It is very important to remember, however, that we
are interested in your first impressions; so I’ll not
let you look at the picture very long—only about 
three or four seconds. When I turn each card over, 
all you need to do is to indicate whether your first 
impression is one of “like” or “dislike.” Do you have 
any questions before we begin? 

It should be noted the above procedures 
contain two changes from the original study: 
individual instead of group administration 
and instructions limiting responses to two 
categories (“like,” “dislike”) instead of four 
(“very likeable,” “mildly likeable,” “mildly 
disliked,” “definitely disliked”). These changes 
were required by the experimental design. 

Experimental design and procedure. The 60 
experimental pictures represented an equal 
number of males and females. There were 20 
pictures (10 male and 10 female) for each of 
the generations—young, peer, and old. The 
age ranges for the three groups were roughly 
estimated at 2 to 5, 18 to 24, 45 to 50. The 
60 pictures were divided at random into five 
sets of 12 with the restriction that the two 
sexes and three generations be represented 
equally in each set. 

A latin square design was utilized since it
was expected that order of presentation might
be a complication. The design was essentially 
a random 5 X 5 latin square in which each 
cell was a random 12 X 12 latin square. The 
12 pictures in each set were assigned at ran- 
dom to the letters A through L, which in turn 
formed the “treatments” or individual units in 
a 12 x 12 latin square. Each set was as- 
signed to the letters a through e of the 5 x 5 
latin square. Actually, five random 12 x 12 
latin squares were made up for each set, and 
the resulting 25 12 X 12 latin squares were 
assigned at random to the cells of the 5 x 5 
latin square. This made a single 60 x 60 
latin square, the 60 columns representing sub- 
jects, the 60 rows order of presentation of the 
pictures, and the 60 “letters” the complete 
series of pictures described above. Sixty male 
subjects and then 60 females were assigned 
at random to the columns, and since the same 
pattern was repeated for males and females, 
the 60 pairs of subjects received the pictures 
in a different random order. 
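
The nested construction described above can be sketched in code. This is only a sketch under stated assumptions: each small square is made by shuffling the rows, columns, and symbols of a cyclic square, which yields a valid latin square but does not sample uniformly from all latin squares, and a fresh 12 x 12 square is drawn for every cell rather than taken from a prepared stock of five per set. The function names, the picture numbering (set s holds pictures 12s to 12s + 11), and the seed are hypothetical.

```python
import random


def random_latin_square(n, rng):
    """n x n latin square from a cyclic square with shuffled rows, columns and symbols."""
    rows, cols, syms = list(range(n)), list(range(n)), list(range(n))
    rng.shuffle(rows)
    rng.shuffle(cols)
    rng.shuffle(syms)
    return [[syms[(r + c) % n] for c in cols] for r in rows]


def nested_latin_square(n_sets, set_size, rng):
    """Outer n_sets x n_sets square of sets; each cell holds a set_size x set_size square."""
    outer = random_latin_square(n_sets, rng)
    size = n_sets * set_size
    square = [[None] * size for _ in range(size)]
    for bi in range(n_sets):
        for bj in range(n_sets):
            picture_set = outer[bi][bj]
            inner = random_latin_square(set_size, rng)
            for i in range(set_size):
                for j in range(set_size):
                    picture = picture_set * set_size + inner[i][j]
                    square[bi * set_size + i][bj * set_size + j] = picture
    return square


def is_latin(square):
    """True if every symbol 0..n-1 occurs exactly once in each row and each column."""
    n = len(square)
    want = set(range(n))
    rows_ok = all(len(row) == n and set(row) == want for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == want for j in range(n))
    return rows_ok and cols_ok


# Rows = order of presentation, columns = subjects, entries = picture numbers.
design = nested_latin_square(5, 12, random.Random(1957))
print(len(design), is_latin(design))
```

Because each picture set appears once in every block row and block column of the outer square, and each inner square is itself latin, every picture meets every presentation position and every subject exactly once.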

The “dislike” and “like” responses of each 
subject to the pictures as presented were
scored as 0 and 1 respectively. Since each
entry in the two identical 60 x 60 latin 
squares was either 0 or 1, the usual analysis 
of measurements could not be applied. In- 
stead, three independent analyses based upon 
the column totals, row totals, and picture to- 
tals were computed for male and female sub- 
jects separately. For the analysis of variance, 
each total was converted to a percentage of 
60 and then transformed to angles by the in- 
verse sine transformation (2). 
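
A minimal sketch of the transformation step follows, assuming the usual form of the inverse sine transformation: the angle, in degrees, whose sine is the square root of the proportion. The function name is an illustrative assumption, and the counts are hypothetical totals out of 60.

```python
import math


def arcsine_angle(count, n=60):
    """Angle in degrees whose sine is the square root of the proportion count / n."""
    if not 0 <= count <= n:
        raise ValueError("count must lie between 0 and n")
    return math.degrees(math.asin(math.sqrt(count / n)))


# Hypothetical "like" totals out of 60 presentations.
for likes in (0, 15, 30, 45, 60):
    print(likes, round(arcsine_angle(likes), 1))
```

The point of the transformation is that the sampling variance of the angle depends only on the number of observations and not on the proportion itself, so totals near 0 or 60 can enter the same analysis of variance as totals near 30.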


Analysis and Results 


The analysis of variance based on column 
totals and presented in Table 1 tested the sig- 
nificance of the difference in response of male 
and female subjects to the photographs. The 
interaction or error variance was larger than 
that for pairs and that for male vs. female. 
Thus, there was no difference in average
response among pairs or between sexes of
subjects. However, the combined mean square
with 119 degrees of freedom showed a variation
between subjects from 5 to 6 times larger
than would be expected by random sampling
from a single binomial population
(σ² = 820.7/60 = 13.68).
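
The "5 to 6 times" comparison can be checked with a few lines of arithmetic. The sketch pools the three mean squares listed in Table 1 (59, 1, and 59 degrees of freedom) and divides by the theoretical binomial variance of an angle in degrees, 820.7/n. The constant 820.7 is the one quoted in the text; it equals (180/π)²/4. The variable names are illustrative assumptions.

```python
# (degrees of freedom, mean square) for the three sources in Table 1.
components = [
    (59, 69.70),  # pairs of subjects
    (1, 37.30),   # male vs. female subjects
    (59, 86.63),  # sex x pairs
]


def pooled_mean_square(parts):
    """Combine mean squares by summing their sums of squares and degrees of freedom."""
    total_ss = sum(df * ms for df, ms in parts)
    total_df = sum(df for df, _ in parts)
    return total_ss / total_df, total_df


def binomial_angular_variance(n):
    """Theoretical variance (in squared degrees) of an arcsine-transformed proportion."""
    return 820.7 / n


combined, df = pooled_mean_square(components)
expected = binomial_angular_variance(60)
print(df, round(combined, 2), round(expected, 2), round(combined / expected, 2))
```

The ratio of the pooled mean square to the binomial expectation comes out between 5 and 6, in line with the statement above.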

The analysis of variance based on row to- 
tals tested for trends in the proportion of 
“like” responses as a function of the order of 
presentation. Separate trends were computed 
with orthogonal polynomials for a parabola 
from the row totals averaging the male and 
female subjects and from the corresponding 
differences between male and female sub- 
jects. Since residual variances from the sepa- 


Table 1

Analysis of Variance Based on Column Totals (in
Angles) Testing Significance of Difference
Between Pairs and Sexes of Subjects

                                      Mean
Source                       df       square    F        F
Pairs of subjects            59       69.70     .80
Male vs. female subjects     1        37.30     .43
Sex × pairs                  59       86.63     1.00
Total                        119      77.82              5.69*
Binomial variance            ∞        13.68              1.00

* Significant at .05 level.


Table 2

Analysis of Variance Based on Row Totals (in Angles)
Testing Significance of Any Trend in Proportion
of “Like” Responses upon Position of
Picture in Order of Presentation

                                      Mean
Source                       df       square    F        F
M-F (Male-Female)            1        43.44
Order of presentation
  Linear (M+F)               1        69.32     7.53**
  Quadratic (M+F)            1        9.00      .98
  Linear (M-F)               1        4.78      .52
  Quadratic (M-F)            1        1.67      .18
Error                        114      9.21      1.00     .67**
Binomial variance            ∞        13.68              1.00

* Significant at .05 level.
** Significant at .01 level.


rate analyses of the totals and the differences 
were homogeneous, they were pooled in a 
single error term and the combined results 
are presented in Table 2. The average linear 
effect was significant (F = 7.53; p < .01),
indicating the subjects tended to give more 
“like” responses as they progressed through 
the series of 60 pictures. The tendency to 
give more positive responses as the experi- 
ment progressed, however, was not signifi- 
cantly different for males and females, as can 
be seen from the linear term for males minus 
females (M-F; F = .52). Both of the
quadratic terms, measuring simple curvature
in the trend, were smaller than the error
variance. The error term in Table 2 represents
within-subjects variation and is not
appropriate for testing the difference in the
average response of the male and female
subjects (see Table 1). However, the error within
subjects was significantly smaller (p = .001)
than that expected for random binomial
variation, apparently reflecting the individual
consistency of response.
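
The trend analysis with orthogonal polynomials can be illustrated as follows. This is a sketch of the general technique for equally spaced positions, not a reconstruction of the authors' computations: the linear coefficients are centered position numbers, the quadratic coefficients are their squares minus the mean square, and each contrast yields a sum of squares with one degree of freedom. The row totals are hypothetical and contain a pure linear trend.

```python
def orthogonal_contrasts(k):
    """Linear and quadratic orthogonal polynomial coefficients for k equally spaced positions."""
    mid = (k - 1) / 2.0
    linear = [i - mid for i in range(k)]
    # The mean of the squared linear coefficients is (k^2 - 1) / 12.
    quadratic = [c * c - (k * k - 1) / 12.0 for c in linear]
    return linear, quadratic


def contrast_ss(values, coeffs):
    """Sum of squares (1 df) carried by one contrast among the values."""
    numerator = sum(c * y for c, y in zip(coeffs, values)) ** 2
    return numerator / sum(c * c for c in coeffs)


lin, quad = orthogonal_contrasts(60)

# Hypothetical angular row totals rising steadily across the 60 positions.
totals = [40.0 + 0.1 * i for i in range(60)]
print(round(contrast_ss(totals, lin), 2), round(contrast_ss(totals, quad), 6))
```

With a purely linear set of totals, the whole of the variation among positions falls in the linear term and essentially none in the quadratic term, which is the pattern the F ratios in Table 2 are testing for.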
The third and principal orthogonal factor
in the 60 X 60 latin square of initial 0 and 1
scores was that represented by the letters or
picture stimuli. There were an equal number 
of pictures in the six categories in Table 3, 
and the 10 pictures in each category gave the 
“like” response totals in the first two columns 
of the table. Each picture occurred once but 
only once in each position of presentation 
from 1 to 60 and before each of the 60 male 
subjects, and similarly before each of the fe- 
male subjects. From the angles for the 60 ex- 
posures of each picture to subjects of the 
same sex, totaled by categories in the next 
two columns of Table 3, analyses of variance 
showed no differences between the five sets of 
12 pictures used in setting up the 60 X 60 
latin square, either on the average or in their
interactions with picture category. Accord- 
ingly, the variation in angular scores among 
the 10 replicate pictures in each category 
formed the error terms in Tables 4 and 5. 
The average effect of picture category summed 
over both sexes of subjects was computed 
from the totals in the next to last column of 
Table 3, and the differential response of male 
and female subjects from the differences in 
the last column of Table 3. 
The analysis of variance for picture totals, 
the sum of “like” responses for males to each 
picture plus corresponding sum for females 
with both transformed to angles, is presented 


Table 3

Subjects’ “Like” Responses by Picture Category

                       Total “like” responses     Sum of 10 responses in angles from
                       in 600 by
Picture Category       Males          Females     Males     Females    M+F       M-F
Male     young         [illegible]    500         609.6     675.5      1285.1    -65.9
         peer          407            394         559.2     544.4      1103.6    14.8
         old           424            386         574.8     538.0      1112.8    36.8
Female   young         431            464         603.2     661.9      1265.1    -58.7
         peer          465            478         621.5     638.8      1260.3    -17.3
         old           347            366         505.1     527.6      1032.7    -22.5


Table 4

Factorial Analysis Based on Picture Totals (in Angles) Summed Over Both Sexes of Subjects
Testing the Significance of the Average Effect of Picture Category

                                                  Mean
Source                                   df       square         F              F
Young and peer vs. older generation      1        1617.72        5.21*
Young vs. peer generation                1        433.85         1.40
Sex of picture                           1        26.70          .09
Sex × young and peer vs. older           1        [illegible]    [illegible]
Sex × young vs. peer                     1        [illegible]    [illegible]
Error                                    54       310.59         1.00           22.70**
Binomial variance                        ∞        13.68                         1.00

* Significant at .05 level.
** Significant at .01 level.


in Table 4. By factorial analysis, a separate 
estimate of variance was computed for each 
of the five degrees of freedom associated with 
picture effects (two sexes and three genera- 
tions of pictures). As indicated by the first 
term in Table 4, the subjects preferred the 
pictures of the young and peer groups to those 
of the older generation (p < .05). A study of 
generation totals showed that the greatest 
contrast was young vs. old. Since no other 
picture effect approached significance, these 
preferences were independent of the sex of 
the person pictured. That a very marked in- 
dividuality attached to each picture in the 
responses of all subjects, however, is attested 
by an observed error variance far in excess of 
that expected. 
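
The "expected" variance used in this comparison follows from the angular transformation itself. The sketch below assumes the usual arcsine square-root transformation in degrees, whose theoretical binomial variance is about 820.7/n; with n = 60 exposures per picture this gives the 13.68 shown in the last row of Table 4. The function names are illustrative, not the authors'.

```python
import math


def to_angle(p):
    """Angular (arcsine square-root) transformation of a proportion, in degrees."""
    return math.degrees(math.asin(math.sqrt(p)))


def binomial_angle_variance(n):
    """Theoretical variance of a transformed proportion based on n responses."""
    return (180.0 / math.pi) ** 2 / (4.0 * n)  # approximately 820.7 / n


expected = binomial_angle_variance(60)   # 60 subjects of one sex per picture
observed_error = 310.59                  # error mean square from Table 4
ratio = observed_error / expected        # observed / expected variance
```

Under these assumptions the ratio comes out near 22.7, in line with the second F column of Table 4.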

A similar analysis for differences in picture 
totals (sum of “like” responses for male 
subjects minus the corresponding sum for female 
subjects) is presented in Table 5. The 
first three terms in Table 5 were significant— 
the average difference in response between 
male and female subjects and their differen- 
tial response to the generations represented 
by the pictures. The direction of this differ- 
ence in response is evident from the totals 
for young, peer, and older groups in Table 3. 
Although male and female subjects agreed in 
their preference for their own generation, the 
female subjects preferred pictures of the 
young generation and male subjects pictures 
of the older generation, the sex of the picture 
having no significant effect. Male subjects’ 
totals for the young and peer picture groups 
showed very little difference, while female 
subjects’ totals for the three age groups were 
fairly widely separated. The finding of a significantly 


Table 5 


Factorial Analysis Based on Differences Between Male and Female Subjects’ Picture Totals 
(in Angles) Testing the Significance of the Differential Response of 
Male and Female Subjects to Picture Categories 


                                                  Mean 
Source                                   df      square        F 
Sex of subject                            1      106.03      5.50* 
Young and peer vs. older generation       1      101.01      5.24* 
Young vs. peer generation                 1      186.36      9.67** 
Sex of picture                            1       59.08      3.06 
Sex × young and peer vs. older            1       36.58      1.90 
Sex × young vs. peer                      1       19.31      1.00 
Error                                    54       19.28      1.00      1.41 
Binomial variance                         ∞       13.68                1.00 


* Significant at .05 level. 
** Significant at .01 level. 

larger number of positive responses 
among female than among male subjects 
seems to contradict the nonsignificance of the 
difference in Table 1. The comparison in 
Table 5, however, has the difference in re- 
sponse of male and female subjects to the 
individual picture as the unit and in con- 
sequence has a smaller more critical error 
variance than that in Table 1 where the in- 
dividual subject was the unit. 

Of especial interest is the much smaller 
error variance for the differences between 
male and female subjects in Table 5 than for 
their sum in Table 4, the latter being 16 
times as large as the former. In fact, the 
variance of the differences is the only one 
which does not differ significantly from bi- 
nomial expectation. This emphasizes again the 
individuality of the pictures used as stimuli. 
Randomly paired male and female subjects 
agreed within the binomial error, after ex- 
cluding differential sex responses to genera- 
tions of pictures, as to their liking for a given 
picture. Factors associated with each picture 
but not isolated in the present study were 
evidently of critical importance in the ob- 
tained response. 
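
The F ratios in Table 5 and the 16-fold contrast in error variance noted above can be reproduced directly from the tabled mean squares. This is a sketch for checking the arithmetic only; the critical values for F(1, 54) are approximate figures interpolated from standard tables (an assumption, not values given in the paper), and the variable names are hypothetical.

```python
ERROR_MS = 19.28  # error mean square, 54 df (Table 5)

mean_squares = {
    "Sex of subject": 106.03,
    "Young and peer vs. older generation": 101.01,
    "Young vs. peer generation": 186.36,
    "Sex of picture": 59.08,
    "Sex x young and peer vs. older": 36.58,
    "Sex x young vs. peer": 19.31,
}

# Approximate critical values of F with 1 and 54 df (interpolated, assumed)
F_05, F_01 = 4.02, 7.13


def stars(f):
    """Significance marker in the style of the table footnotes."""
    if f >= F_01:
        return "**"
    if f >= F_05:
        return "*"
    return ""


results = {}
for name, ms in mean_squares.items():
    f = ms / ERROR_MS
    results[name] = (round(f, 2), stars(f))

# Error variance of sums (Table 4) relative to differences (Table 5)
error_ratio = 310.59 / ERROR_MS
```

The three starred effects match the first three terms described as significant in the text, and `error_ratio` is a little over 16.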

Although the average difference in response 
among age-sex categories was not significant, 
some differences between male and female 
subjects were observed in each of the six picture 
categories. The largest differences between 
male and female subjects were in terms 
of the greater preference of the female subjects 
for the young female and young male 
categories. The next largest difference between 
male and female subjects came from 
the male subjects’ greater preference for the 
older male category. Male subjects gave more 
“like” responses to peer males than peer 
[females, and female s]ubjects preferred peer 
[fe]males to peer males, but both these differences 
were small. 


Discussion 


The principal findings of the previous 
study were: (a) male and female subjects 
gave a different like-dislike ratio for all age-sex 
categories combined; (b) the subjects as 
a group responded differently to the age-sex 
categories represented by the pictures; (c) 
male and female subjects responded with different 
like-dislike ratios to four out of seven 
age-sex categories. Further, the previous study 
suggested that age was a more important de- 
terminant of differential preferences than sex. 

In the present study it was found that: (a) 
there was an average difference between male 
and female subjects in their responses to individual 
pictures; (b) the subjects as a group 
responded more favorably to pictures repre- 
senting young and peer generations than to 
those of older adults; (c) male subjects re- 
sponded differently from female subjects to 
the age groups. 

In comparing the two studies, it should be 
noted that in the present study the difference 
between the total numbers of “like” responses 
given by male and female subjects was sig- 
nificant only when compared picture by pic- 
ture. Further, the first study indicated that 
age and sex of picture were important deter- 
minants of preference ratings while the sec- 
ond study showed age to be significant and 
sex not. However, the two studies were con- 
sistent in showing that peer females were pre- 
ferred to peer males and older males to older 
females. In both studies females gave the 
larger number of “like” responses, and the 
order of preference for age groups was the 
same. 

On the average, subjects gave more “like” 
responses as the experiment proceeded. Since 
each pair of subjects received the pictures in 
a different random order, this observed linear 
effect was independent of any particular pic- 
ture or group of pictures. This position effect 
seems to be a hitherto unconsidered bias in 
studies of the present type, and one which 
could affect the responses to the Szondi, TAT 
and similar projective techniques. 

The decrease in number of “like” responses 
as age of the picture group increased sounds 
another note of caution in the use of human 
photographs as stimuli. Interpretations of re- 
sponses to “mother figures” and “father fig- 
ures” might well refer to the cultural pattern 
as well as to individual dynamics. Since male 
subjects responded differently from females to 
the generations represented by the pictures, 
this finding supports the suggestion made in 
the previous study that it might be well to 
consider establishing separate norms for each sex. 
Further study of this different reaction of 
males and females might throw some light on 
the development of cultural-sexual roles, such 
as the point in the life span when this differential 
response begins. 

The comparison of the observed error vari- 
ances with the variance expected from ran- 
dom sampling from a single binomial popula- 
tion yields some information on the usefulness 
of photographs of human faces as stimuli. 
From Table 1, the ratio of observed to ex- 
pected variance indicates significant variabil- 
ity between subjects in their response to the 
pictures, while the comparable ratio from 
Table 2 shows a greater than chance consistency 
of response within subjects. The observed-expected 
variance ratio from Table 4 
demonstrates the marked individuality attached 
to each picture in the responses of all 
subjects. As shown in Table 5, the error vari- 
ance for the differences between male and fe- 
male subjects is much smaller than that for 
their sum and is the only one which does not 
differ significantly from binomial expectation. 
After excluding differential sex responses to 
generations of pictures, randomly paired male 
and female subjects agreed within the bino- 
mial error as to their liking for a given 
picture. The variability due to pictures is apparently 
of more importance than that associated 
with subjects. This further emphasizes 
the individuality of the photographs of hu- 
man faces used as stimuli and indicates that 
factors associated with each picture but not 
identified in this study were of critical im- 
portance in the subjects’ responses. Similar 
stimulus characteristics for photographs of 
human faces were discussed by Izard (3) in 
a study which utilized preference ratings and 
verbalized projective responses. 


Summary 


Sixty photographs of human faces repre- 
senting both sexes and three generations were 
individually administered to 60 college males 
and 60 college females. Each pair of randomly 
assigned male and female subjects received 
the pictures in different random order. 
The subjects gave a “like” or “dislike” response 
to each of the 60 pictures. The “dislike” 
and “like” responses were scored 0 and 
1. For male and female subjects separately, 
row totals, column totals, and picture totals 
were converted to percentages and then trans- 
formed to angles. The results of the analyses 
of variance indicated that: 


1. The pairs of subjects did not differ sig- 
nificantly. 

2. A significant difference in the average 
responses of male and female subjects could 
only be detected when compared picture by 
picture. 

3. Both male and female subjects tended to 
give more “like” responses as the experiment 
progressed. 

4. There was no difference in response of 
male and female subjects to the two sexes 
represented by the pictures. 


5. The subjects responded differently to the 
generations represented by the pictures. 


6. Male and female subjects differed in 
their response to the three generations repre- 
sented by the pictures. 


7. A comparison of error variances indi- 
cated a strongly specific response to the in- 
dividual pictures selected at random within 
each age-sex category. 


Received July 16, 1956. 


There are an estimated 300,000 blind persons 
in the United States. Much of the information 
about these people comes from 
anecdotal reports and conjectures. Very little 
can be stated with confidence about the at- 
tributes of the blind, and even less about the 
interplay between blindness and adjustment. 
Basic information is needed, and this study 
seeks answers to two rather broad, funda- 
mental questions: (a) what tests are of value 
in assessing the adjustment of the blind; and 
(b) what characteristics do these tests show 
to be typical of the blind? Answers to these 
queries, based upon explicit rationale and 
careful methodology, can offer some founda- 
tion which may act as a basis for a psychol- 
ogy of blindness. 


Problem 


Much material has been written about the 
blind, but descriptive phrases have only recently 
given way to testing and group investigations. 
And, in general, the literature 
reveals that studies have been done too poorly 
or are too conflicting to provide reliable information 
concerning adjustment to blindness. 
Barker et al. (1) indicated the concern 
of many workers with the blind when they 
wrote that most research to date consists 
merely of random exploratory collections of 
data. Worchel (10) discussed “shotgun comparisons,” 
the limited num[ber of subjects in] 
many studies, and a number of other methodological 
errors. As recently as 1954 Bauman 
and co-workers “. . . claim that this is 
the first scientific writing on adjustment to 
blindness” (3, p. 1). However, their study 
was avowedly “. . . to inquire into all the 
factors which seem likely to have some relationship 
to adjustment to blindness . . .” (3, 
p. 3). One of the best publications, a collection 
of papers by Donahue and Dabelstein 
(6), is generally more suggestive than defini- 
tive, and opinions primarily support the pres- 
entations because of the paucity of facts. 
The present study is concerned with the 
problems of just what tests to use in evaluat- 
ing adjustment to blindness; the modifications 
which may be required in the interpretation of 
“sighted” test results with the blind; and any 
unique personality patterns related to blind- 
ness as such. The research was conceived as 
one for practical purposes involving voca- 
tional rehabilitation clients. The 54 blind sub- 
jects of the investigation consisted of 34 males 
and 20 females, at present living in the State 
of Oregon, and probably no different in any 
major respect from the blind elsewhere. Many 
subjects came to Oregon from other states 
both before and after the acquisition of a 
visual impairment. Insofar as the national 
figures are known, the blind population statistics 
for Oregon are proportionately comparable. 


The Experimental Design 


The research is based on a 3 × 3 × 2 factorial 
design with three replications. Each 
subject was classified in terms of: (a) ad- 
justment, good, fair, or poor; (b) duration of 
handicap, born blind, long-term with visual 
experience, and recent blind; and (c) re- 
maining acuity, little or no vision, and rela- 
tively “good” vision. Three subjects appear 
in each of the possible combinations of the 
above groupings in order to obtain a meas- 
ure of variability within a given subcategory. 
Table 1 shows the factorial design in tabular 


Table 1 


The Factorial Design in Tabular Form 
(Indicating the number of subjects in each cell) 


                        Visual categories 
Adjustment      Group I      Group II     Group III 
rating         Tot   Tvl    Tot   Tvl    Tot   Tvl    Total 
Good            3     3      3     3      3     3       18 
Fair            3     3      3     3      3     3       18 
Poor            3     3      3     3      3     3       18 
Total           9     9      9     9      9     9       54 


form, and in the following sections each clas- 
sification of the subjects will be discussed in 
detail. 
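
As a quick check on the layout in Table 1, the 3 × 3 × 2 design with three replications can be enumerated cell by cell. The sketch below is illustrative only; the variable names are hypothetical, and the level labels are taken from the classifications described in the text.

```python
from itertools import product

ADJUSTMENT = ("good", "fair", "poor")
DURATION = ("Group I", "Group II", "Group III")
ACUITY = ("Tot", "Tvl")
REPLICATIONS = 3  # subjects per cell

# Every combination of the three classifications is one cell of the design
cells = list(product(ADJUSTMENT, DURATION, ACUITY))
subjects_per_cell = {cell: REPLICATIONS for cell in cells}

total_subjects = sum(subjects_per_cell.values())
per_adjustment = {
    level: sum(n for (adj, _, _), n in subjects_per_cell.items() if adj == level)
    for level in ADJUSTMENT
}
per_visual_column = {
    (dur, acu): sum(n for (_, d, a), n in subjects_per_cell.items()
                    if (d, a) == (dur, acu))
    for dur, acu in product(DURATION, ACUITY)
}
```

The marginal totals (18 per adjustment row, 9 per visual column, 54 overall) agree with the totals printed in Table 1.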


Adjustment. It has become almost traditional to 
speak of blind adjustment as involving the areas of: 
(a) freedom and ability to travel independently; 
(b) interpersonal relations and acceptance by so- 
ciety; (c) outlook on life; (d) self concept and self- 
acceptance; (e) work experience and attitudes; (f) 
social participation; (g) acceptance of limitations 
and use of assets; (h) satisfactory grooming and 
hygiene; (i) education and its application. These 
nine areas cover a number of kinds of behaviors and 
types of adjustments so that no measure which 
would attempt to account for all of them would be 
unitary. Taken together they offer an approach to 
evaluating blind adjustment. 

It was rather arbitrarily decided that two state- 
ments exemplifying good adjustment in each of the 
nine areas would be used as the basis for evaluating 
the subjects. Ratings were made on a five-point 
scale; the combined arithmetic means of the raters 
being accepted as the best over-all estimate of a sub- 
ject’s observable adjustment. Without a more defi- 
nite criterion available it is necessary to call upon 
experienced workers in the field to evaluate observ- 
able behaviors, and to use their consensus judgment 
as a measure of functioning adjustment. The eighteen 
statements were reworked until unanimity of con- 
cept was achieved by the judges. The judges were 
persons with years of experience in work with the 
blind and were themselves personally acquainted 
with visual handicap.¹ All are employed at the Oregon 
State Commission for the Blind. 

As the judges’ ratings are basic to this study a 
measure of retest reliability was executed. Practical 
considerations precluded complete rerating, so the 


1 The investigator is grateful for the assistance of: 
Mr. Clifford Stocker, Administrator; Mr. Charles 
Brown, Director of Vocational Rehabilitation for the 
Blind; and Mr. George Howeiler, Supervisor of Social 
and Educational Services. 

judges were asked three months later to rerate each 
subject on a five-point scale of “global” adjustment. 
Again the mean rating was taken to be the best 
estimate of the subjects’ adjustments. In the first 
ratings the subjects were listed in a random order, 
while in the second they were listed alphabetically 
by name as an aid in controlling for a “position” set. 

Despite “global” rerating, the combined judges’ re- 
ratings yielded a coefficient of reliability of .91 indi- 
cating high reliability and consistent judgments. Rat- 
ings were not made unless the judge felt he had 
sufficient data about the subject. In this way guess- 
ing was reduced and a “halo effect” minimized. All 
subjects were not known equally well by all raters; 
eight evaluations had to be based on the combined 
ratings of only two judges, but these were in rea- 
sonably close agreement. 

Duration of handicap. Three groups were distin- 
guished according to the amount of visual experience 
prior to blindness, and the duration of the visual 
condition. Group I: those who have been blind for 
over ten years and who became blind before their 
fifth year of life, and therefore are presumed to have 
minimal retention of visual imagery. Group II: those 
who have been blind for ten years or more and who 
became blind after their fifth year of life, and who 
are presumed to have residual visual imagery. Group 
III: those who have been blind for less than five 
years and who have had many years of visual experience, 
and who may be expected to show some 
traces of the adjustment needs engendered by the 
loss of vision. 

Remaining acuity. The two subgroups based upon 
residual vision are the last to be considered in con- 
nection with the factorial design. (Tot): those who 
have a total loss and those with only light percep- 
tion, on the assumption that these conditions pro- 
vide a somewhat homogeneous visual group based 
on minimal useful sight. (Tvl): those who have 
travel vision and those with object perception, on 
the assumption that these conditions are similar visu- 
ally and are qualitatively different from the first sub- 
group. 


Measures of Adjustment 


Subjects’ ratings. In order to measure the 
client’s evaluation of himself a list of 100 
statements was compiled from various sources. 
These items had been used in previous studies 
to measure some phase of adjustment. 
Three major areas were utilized: (a) general 
adjustment; (b) adjustment to blindness; 
and (c) body image. These items were sub- 
mitted to judges with the instruction to choose 
those statements they felt would best discriminate 
degrees of adjustment of blind persons. 
The judges for this list were three experienced 
workers with the blind, mentioned 
previously, the author, and three psychologists 
at the University of Portland.² 


The items selected for this study were those agreed 
upon by at least four of the judges. Whenever pos- 
sible only those items were used which showed 
agreement between the psychologists and the work- 
ers. After pretest trials ambiguous wording was re- 
duced without destroying the intent or meaning of 
the item. In this manner a 34-item self-rating scale 
was devised. 

The subject was asked to indicate, on the basis of 
a seven-point scale, how much each item applied to 
himself. In order to make a response on a seven- 
point scale possible for the blind subjects, a card 
with the seven responses in Grade II Braille down 
the left-hand side, and with large, black print down 
the right-hand side, was given to each subject as a 
memory aid while formulating his answers. Only a 
few subjects could use neither Braille nor print; for 
these the responses were read slowly through after 
each statement. 

The subjects were also asked to evaluate themselves 
on an over-all or “global” basis by selecting 
one of five categories, from poor to very good, which 
they felt best described their adjustment. 


Psychological tests. A selection of tests was 
made so as to include those which: are pres- 
ently used with the blind; sample different 
aspects or levels of adjustment; have sighted 
norms; and are of various construction types 
or theoretical bases. 


Bauman’s (2) Emotional Factors Inventory (EFI) 
was selected as exemplifying the attempt to measure 
adjustment to blindness. It is an inventory type test 
which is new but apparently well received. This test 
purports to measure the areas of: Sensitivity, Somatic 
Symptoms, Social Competency, Paranoid 
Tendency, Feelings of Inadequacy, Depression, Atti- 
tudes re Blindness, and a measure of Validity. 

The Minnesota Multiphasic Personality Inventory 
(MMPI) is a commonly used test which was de- 
veloped on clinical populations. It has extensive 
norms and has been applied to the blind. Although 
a method of arriving at a single adjustment score 
may exist, the investigator could not find mention 
of one in a search through possible sources. For that 
reason the scoring method used in this study will be 
clarified. It was decided that maladjustment was indicated 
by two major dimensions: amount of deviation 
from the midline on the clinical scales, and the 
number of such deviations. By assigning the number 
one to scores falling within the range 41-59 a subject 
scoring within this area on all ten factors would 


2 The investigator expresses his indebtedness to: 
Dr. William Botzum, Chairman, Department of Psychology; 
Dr. Gordon Higginson, Director, Psychological 
Services; and Dr. Frank Strange, Staff Psychologist, 
Psychological Services. 

receive a minimum score of 10. In a like manner, 
factors scored in the ranges of 60-69 and 31-40 
would carry a weight of two; scores below 30 or in 
the range 70-79 would be weighted three; and so on 
with scores above 80. Using this scoring procedure 
it was possible to get a single figure representative 
of both number of deviations and their extent. 
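
The weighting scheme just described can be written out as a short function. This is a sketch under stated assumptions rather than the author's scoring sheet: the text does not say how a score of exactly 30 or exactly 80 is handled, nor what weight applies above 80, so the code groups 30 with "below 30" and gives every score of 80 or more a weight of four. The function names are hypothetical.

```python
def scale_weight(score):
    """Weight for one clinical-scale score, following the bands in the text."""
    if 41 <= score <= 59:
        return 1
    if 60 <= score <= 69 or 31 <= score <= 40:
        return 2
    if 70 <= score <= 79 or score <= 30:
        return 3  # assumption: exactly 30 is grouped with "below 30"
    return 4      # assumption: 80 and above continue the series with weight 4


def composite_score(scores):
    """Single maladjustment figure summed over the ten scales."""
    if len(scores) != 10:
        raise ValueError("expected scores on ten scales")
    return sum(scale_weight(s) for s in scores)


all_midrange = composite_score([50] * 10)
two_deviant = composite_score([65, 50, 50, 50, 50, 50, 50, 50, 50, 72])
```

A profile entirely within 41-59 yields the minimum of 10, and each deviant scale raises the total in proportion to how far it departs from the midline, so the single figure reflects both the number and the extent of deviations.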

The Rotter Incomplete Sentences Blank (ISB) is 
a semiprojective technique consisting of 40 items. 
The author (7) presents norms for normal and mal- 
adjusted populations. Scoring is based on rating the 
completion in terms of expressed conflict, with lower 
over-all scores indicating better adjustment. 

The Sargent Insight Test (Insight) is a new pro- 
jective technique which is applicable to the blind. 
Brief situations, termed armatures, are presented to 
the subject and scoring is based upon the answers to 
the two questions: “What did the person do, and 
why?” and “How did the person feel?” Fifteen situa- 
tions constitute the test, with alternative forms for 
male and female subjects not used in this study. 
Quantification of responses is possible in the areas 
of Affect (A), Defense (D), and the ratio between 
them (A/D). Twelve “feeling categories” are also 
used to classify further all responses scored Affect. 
The author (8) presents data from various groups 
of subjects so that comparisons are possible. The 
test not only aims to reach deeper aspects of the 
personality but it offers a measure of defensiveness 
at the same time. 

The Wechsler-Bellevue Intelligence Scale, Form I, 
Verbal Scale, was used to measure the subjects’ in- 
tellectual abilities. Although intelligence was not con- 
trolled by sampling in the study, its importance 
could not be ignored. By statistical analysis the influence 
of intelligence upon the subjects’ scores could 
be controlled if necessary. The only change needed 
for blind subjects is the substitution of the alternate 
questions in the Comprehension subtest as suggested 
by Bauman and Hayes (4). 


Procedure 

The order of test presentation was ran- 
domly determined so that no position bias 
could occur, and the assignment of the sub- 
jects to the cells of the experimental design 
was as random as the design permits. Co- 
operation of the subjects was solicited through 
an appeal to the need for information about 
the blind, and with the understanding that 
the data would only be used in group terms. 
The method of presentation in all testing was 
oral, with the responses scored by the ex- 
aminer in every case. 

In the experimental design sex was not 
controlled as a factor. In order to determine 
whether sex differences existed in age, education, 
or judges’ ratings the data were submitted 
to t test. In no case could the null hypothesis 


Table 2 


Means and Standard Deviations of the 
Measures of Adjustment 


Test             Mean       SD 
Insight         58.07     14.56 
ISB            130.30     22.78 
MMPI            13.44      2.54 
EFI              5.65       .62 
Wechs.         113.56     13.02 
Self-eval.       5.78*      .74 
Adj. Rating      3.13†      .94 


* Seven-point scale. 
† Five-point scale. 


be rejected, and it was concluded 
that only chance differences existed. Chi 
square, two-tailed and corrected for continu- 
ity, was employed to test other factors which 
might be affected by sex differences. The fac- 
tors tested were marital status, employment 
status, and years of work experience. Differ- 
ences in marital status and in the employ- 
ment status of the sexes were not large enough 
to reject the null hypothesis. However, differ- 
ences in years of work experience yielded a 
chi square large enough to be significant be- 
yond the 1% level. A tendency is revealed 
for the males to work less after suffering a 
visual loss, and for the females to work more 
at that time. This phenomenon might be 
worthy of more critical analysis than by the 
chi square computed in this study. It is rea- 
sonable to assume, then, that conclusions can 
be drawn equally about the sexes in further 
discussions of the subjects since both sexes 
are similar in the major attributes tested. 


Sidney I. Dean 


Results 


The means and standard deviations of the 
various measures, which will be discussed 
later in relation to test scores, are shown in 
Table 2. Intercorrelations between the vari- 
ous measures are presented in Table 3. Four 
of the intercorrelations are significant at the 
1% level, with the EFI and the self-evaluations 
showing greatest agreement. These relationships 
suggest that the EFI and the 
Wechsler, particularly in combination, might 
prove to be good predictors of the adjustment 
ratings. To test this suggestion these two 
measures, and the MMPI and self-evaluations 
which were significantly related to the ad- 
justment rating at the 5% level, were em- 
ployed in a multiple triserial (point biserial) 
correlation. The computations, explained in 
Wert et al. (9), resulted in an R, of .465 
which yields an R, of .51 when corrected for 
coarse grouping. This multiple correlation was 
tested by the F-ratio formula (9); with 4 
and 49 degrees of freedom the F of 3.38 is 
significant beyond the 5% level. However, 
this R, is not so large as to account for more 
than about 26% of the variance to be found 
in the adjustment categories. This amount is 
so small as to make the computation of a dis- 
criminate equation a labor of minor produc- 
tiveness for practical predictions. 
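
The F ratio and the percentage of variance quoted above can be recovered from the reported correlations. The sketch assumes the standard F test for a multiple correlation with k predictors and N subjects; whether this is exactly the formula given in Wert et al. (9) is an assumption, though it reproduces the reported value when the uncorrected coefficient of .465 is used. The function name is hypothetical.

```python
def f_for_multiple_r(r, n_subjects, n_predictors):
    """F ratio for a multiple correlation, with its degrees of freedom."""
    df1 = n_predictors
    df2 = n_subjects - n_predictors - 1
    f = (r ** 2 / df1) / ((1.0 - r ** 2) / df2)
    return f, df1, df2


# Four predictors (EFI, Wechsler, MMPI, self-evaluation) and 54 subjects
f_value, df1, df2 = f_for_multiple_r(0.465, 54, 4)

# Proportion of variance accounted for by the corrected coefficient of .51
variance_explained = 0.51 ** 2
```

With these inputs the degrees of freedom are 4 and 49, F is about 3.38, and the corrected coefficient accounts for about 26% of the variance.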


Analysis of Measures of Adjustment 


Since this study rests upon the division 
of the subjects into adjustment categories 
through the use of mean ratings by profes- 
sional workers these ratings will be evaluated 
first. It is expected that adjustment categories 


Table 3 


Intercorrelations Between Measures of Adjustment 


Test           ISB       MMPI       EFI       Wechs.      Self-eval.    Adj. Rating 
Insight       −.03       −.06       .13     [illegible]   [illegible]   [illegible] 
ISB                       .13       .30*    [illegible]   [illegible]   [illegible] 
MMPI                               −.25*      −.12          −.28*         −.24* 
EFI                                            .28*          .62**      [illegible] 
Wechs.                                                       .00           .39** 
Self-eval.                                                                 .26* 


* Significant at the 5% level of confidence. 
** Significant at the 1% level of confidence. 

based upon the ratings will be well 
differentiated, and interest is centered upon 
differences in rated adjustment among the 
other two classifications. An analysis of 
variance for the judges’ adjustment ratings 
yielded an F well beyond the 1% level as 
expected; however, the time of occurrence 
also yielded an F significant at the 5% 
level. The judges, apparently, tended to favor 
Group II subjects. Since intelligence was 
shown to be significantly related to the ad- 
justment rating, an analysis of covariance 
was employed to determine whether ratings 
were influenced by subjective evaluations of 
intelligence. The covariance between intelli- 
gence and judges’ adjustment ratings yielded 
an F value beyond the 1% level of confidence 
with occurrence and acuity nonsignificant. 
Adjustment accounted for almost all of the 
variation, and intelligence appears to be the 
determinant which made Group II scores 
somewhat higher. 

The subjects’ self-evaluation scores were 
next submitted to an analysis of variance. 
The residual error term accounted for enough 
variability to preclude computation of the F 
ratios. It was concluded, with no significant 
Fs, that the subjects’ self-evaluations were 
not patterned in terms of the adjustment 
categories nor in terms of the visual groupings. 

The EFI scores earned by the subjects were 
placed in the factorial design and an analysis 
of variance computed; none of the F ratios 
was significant. Since a significant relationship 
exists between the EFI and the Wechsler, 
an analysis of covariance was made to determine 
whether control of intelligence would 
permit better discrimination. The adjusted 
means were so low as to disallow the computation 
of the F ratios. It was concluded 
that the EFI does not discriminate adjustment 
of the subjects as defined by the nine 
areas and evaluated by the judges. It also 
appears that with intelligence controlled the 
variability is reduced so that rather than 
masking the discrimination powers of the 
EFI, intelligence seems to contribute something 
to the ability of the test to measure 
degrees of adjustment. 


The MMPI was the next measure considered. 
By placing the previously mentioned 
composite scores into the experimental design 
an analysis of variance could be computed. 
None of the major categories gave a 
large enough mean square to compute the F 
ratio. With the triple interaction, however, 
there is a significant rejection of the null hypothesis 
at the 5% level. This finding is not 
so valuable for practical use as it is for pos- 
sible follow-up. An analysis of covariance was 
not deemed necessary, and it was concluded 
that the MMPI does not adequately differ- 
entiate the subjects in terms of the criterion 
adjustment. 

An analysis of variance was next done with 
the Rotter (ISB) scores. The results indi- 
cated that the total variability was quite 
small and that no category accounted for 
enough of the variance to necessitate compu- 
tation of the F ratios. It was concluded that 
the objective scoring of the ISB does not dis- 
criminate in terms of the adjustment criterion 
used in this study; more value may reside in 
the qualitative data available from the test. 

An analysis of variance was executed on 
the A/D ratio scores earned by the subjects 
on the Insight test. Only one interaction was 
larger than the residual despite the large 
variability, but the resulting F ratio was not 
large enough to reach significance. As the In- 
sight is correlated with intelligence at the 5% 
level, the data were subjected to an analysis 
of covariance. With intelligence controlled no 
variability was larger than the error term. 
The test does not discriminate adjustment on 
the basis of the A/D ratio. 

Intelligence so far has been considered only 
for its possible influence upon the other tests 
with which it is correlated. It was next con- 
sidered in its own right as a possible indi- 
cator of adjustment. The results, however, 
showed no significant F ratios. It was con- 
cluded that intelligence does not vary signifi- 
cantly with the adjustment criterion. It would 
not be expected that visual grouping would 
vary systematically unless a bias had been 
introduced into the study. This affords some 
indication that the design was adequate for 
evaluating the tests. 


Characteristics of the Blind 


The author of the EFI gives various norms, 
the latest of which were based upon an N of 
200 and presented in 1954. The subjects of 
the present study appear to be above the av- 
erage for these norms, scoring even higher 
than the nonhandicapped persons tested by 
the EFI author. It seems reasonable to con- 
clude that the items used in the various areas 
of the test were assumed a priori to discrimi- 
nate adjustment. However, they show too 
much variability and need more refinement 
for value in individual prediction. 

The MMPI psychographs (short form plus 
full K and Si scales) were plotted for the 
mean scores made by the male and female 
subjects. For both sexes the profiles were 
within normal limits on all areas. For both 
sexes, also, there were three “peaks” which 
occur for the K factor, the Mf score, and the 
Ma score. If this pattern is in any way typi- 
cal of blind subjects, it is in areas which have 
not been emphasized or explored. The only 
divergence in sex patterning was in the area 
of social interests, with women scoring above 
the midline and men below. 

Rotter has suggested that a cutting score 
of 135 on the ISB will correctly identify 75- 
80% of the maladjusted cases. The subjects 
of this study scored higher than the adjusted 
population or the college freshmen norms pre- 
sented by the author, but they scored lower 
than the maladjusted groups. If the reasons 
were known why the college group scored 
higher than the adjusted group it would aid 
in hypothesizing the meaning of the blind 
subjects’ mean. 

The Insight test manual presents only two 
protocols from blind subjects so that only 
casual comparison can be made with the sub- 
jects of this study. However, these Insight 
cases and the subjects of this study resemble 
each other more than they resemble the non- 
blind samples furnished by the test author. 
In both cases the Affect-Defense ratio (A/D) 
is much lower for the blind; the mean for 
feeling or action (A) is relatively low, the de- 
fensiveness (D) approaches the norm group, 
malignancy scores (M) are higher, and ag- 
gressive-passive feeling dominates. In general, 
the subjects of this study are more like the 
clinical groupings than like the control group 
of the norms. It is concluded that the Sargent 
norms should be applied with caution to blind 
subjects. 

The mean (verbal) intelligence quotient 
was in the bright-normal range, with about 
19% of the subjects earning scores below av- 
erage. This is the usual finding with blind 
subjects on the Wechsler Form I. As a group 
the subjects were lowest on immediate mem- 
ory (likely to be affected by tension), and 
were highest in differentiating essentials from 
nonessentials. The intellectual processes of 
the blind appear to be no different from those 
of the sighted, although tension may be some- 
what more typical of the blind than of the 
general population. 


Discussion 


Although it was anticipated that discrimi- 
nation among the adjustment categories would 
improve as the tests became more “projec- 
tive,” such was not the case. None of the 
tests was able to differentiate good or poor 
adjustment by a direct comparison of single, 
representative scores. It is obvious that the 
less structured tests offer much qualitative 
data which is lost in the use of single scores. 

The present study indicates that the MMPI 
is applicable to the blind without modifica- 
tion. This finding is in agreement with Cross 
(5), and the need for separate blind norm 
tables is not indicated. With the Insight test, 
however, Sargent’s norms (8) differ enough 
to suggest that such norms should be cau- 
tiously applied to the blind. Further investi- 
gation may provide blind norms. The EFI 
and MMPI scores suggest that the blind are 
not paranoid or depressed as a group, a find- 
ing at variance with previous assumptions. 
The MMPI further suggests three scores 
which might distinguish the blind. And the 
significant triple interaction in the MMPI 
analysis of variance indicates that adjust- 
ments may be differentiated if the variables 
of duration and acuity are controlled. 

The Rotter ISB method of scoring may 
tend to obscure levels of adjustment through 
a process of cancellation. Its value with the 
blind probably resides in qualitative rather 
than quantitative evaluations. The Insight 
will probably lose its major value if only a 
single score such as the A/D ratio were to be 
used. When subscores are utilized, for in- 
stance, the patterning is similar to that for 
hysterics presented by Sargent (8). 

The present study was rather iconoclastic, 
for the findings indicate that some previous 
studies may not have been as indicative of 
“adjustment to blindness” as the authors may 
have hoped. Since somewhat unusual cate- 
gories of vision were used in this study it 
would seem worth while to investigate fur- 
ther these and other groupings which will 
avoid the arbitrary number values now em- 
ployed. Some leads furnished by this study 
may yield more value than those attempts to 
show that the blind are “really” more mal- 
adjusted than the sighted. 


Summary 


The study was designed to evaluate various 
“representative” tests which are or could be 
used with the blind, and to discover factors 
which might be attributed to blindness. The 
subjects, 34 male and 20 female, were rated 
for adjustment, divided into groups on the 
basis of duration of handicap, and subcate- 
gorized on the basis of visual acuity. The ex- 
perimental design, then, was a 3 × 3 × 2 
factorial with three replications. 

The tests selected were the subjects’ self- 
evaluations, Emotional Factors Inventory, 
Minnesota Multiphasic Personality Inven- 
tory, Rotter Incomplete Sentences Blank, 
Sargent Insight Test, and the Wechsler-Belle- 
vue Intelligence Scale. The judges’ adjust- 
ment ratings were found to be highly reliable 
by test-retest, and appeared adequate as the 
adjustment criterion. The major results may 
be summarized as follows: 

1. In general, the various analyses of vari- 
ance and covariance computed from the single 
test scores did not reveal significant differ- 
ences between the subgroups of blind subjects 
on the experimental variables. 


2. A multiple triserial correlation proved 
to be significant but of limited value in pre- 
dicting behavioral adjustment from the single 
test scores. 

3. Norm comparisons showed that the 
MMPI can be used without modification 
with the blind, and that the Insight pattern- 
ing of answers suggests a cautious use of the 
author’s norms with the blind. 


Received July 6, 1950. 
Childrearing Attitudes of Emotionally 
Disturbed Adolescents¹ 


George Spivack 


Devereux Schools 


Various studies over the past decade sug- 
gest a marked consistency in childrearing atti- 
tudes of mothers of sick children. The pur- 
pose of the present study was (a) to examine 
the childrearing attitudes held by emotion- 
ally disturbed adolescents, and (b) to see if 
their attitudes approximate those of mothers 
of sick children, which would suggest atti- 
tude perpetuation. 

A 64-item attitude survey was administered 
to 34 male and 32 female adolescents being 
treated in a residential setting. A normal 
residential school group of 34 males and 45 
females was the control. Each item was previ- 
ously classified by a psychiatrist, two psy- 
chologists, and a social worker as reflecting 
(a) restrictive control, (b) ineffectual control, 
(c) excessive devotion, or (d) cool detach- 
ment in parental attitude toward a child. 
Each S was required to strongly agree, mildly 
agree, strongly disagree, or mildly disagree 
with each item. Each item response was 
weighted in such a manner that the experi- 
mental and control groups could be compared 
on each of the four subscales through the t 
test, and could also be compared on each item 
individually through chi square. 

Comparison on the four subscales revealed 
that the emotionally disturbed adolescents of 
both sexes expressed a significantly more re- 
strictive controlling attitude than the con- 
trols (.05 level). No significant differences 
were found on the other three subscales. Sig- 
nificant chi squares (.05 level or better) were 
obtained on 14 items for the males, and 5 
more items approached significance (.10 
level). There were 14 significant items for 
the females, 1 approaching significance. 


1 An extended report of this study may be ob- 
tained without charge from George Spivack, Dev- 
ereux Schools, Devon, Pennsylvania, or for a fee 
from the American Documentation Institute. Order 
Document No. 5103, remitting $1.25 for microfilm 
or $1.25 for photocopies. 

The results lend support to the hypothesis 
that childrearing attitudes are perpetuated in 
the case of overcontrolling and restricting atti- 
tudes, but not in the case of attitudes reflect- 
ing ineffectual control, excessive devotion, or 
cool detachment. The absence of positive re- 
sults in the latter cases suggests that such 
attitudes are not perpetuated in any direct 
sense, or that if perpetuated, their expression 
is too subtle for such a questionnaire to pick 
up. The positive results suggest the impor- 
tance of further exploration into attitude per- 
petuation, particularly the means whereby 
attitudes are handed down to or defined for 
the younger generation. Approaching the re- 
sults in terms of what light they shed on ado- 
lescent adjustment generally, there is indica- 
tion that emotionally disturbed adolescents, 
much more than normal adolescents, feel a 
strong need for parental or parental-surrogate 
imposition of external controls; they seem to 
feel a stronger need to conform to what they 
see as parental values and standards of right 
and wrong. These results suggest that the sup- 
posed adolescent rebellion is less character- 
istic of emotionally disturbed adolescents than 
of normal adolescents, and that overt behav- 
ior labeled as rebellion in these children is not 
a positive drive for independence and a search 
for new values, but rather a confused search 
for self-definition and standards of conduct 
to follow. 


Brief Report. 
Received December 21, 1956. 
Age, Vocabulary, Anxiety, and Brain Damage 
as Factors in Verbal Learning¹ 


J. P. S. Robertson 
Netherne Hospital, Coulsdon, England 


There are still many difficulties in deciding 
whether inefficiencies shown by neuropsychi- 
atric patients in tests of memory point to 
brain damage or result from factors such as 
advanced age, poor verbal ability and test 
anxiety (1). This investigation was conducted 
to throw light on the relative importance of 
the latter factors in the verbal learning of pa- 
tients without brain damage and to see, when 
they were taken into account, if patients with 
brain damage gave a less efficient perform- 
ance than others. 


Patients Tested 


The undamaged patients were drawn from 
relieved psychotics awaiting discharge and 
chronic psychotics on parole who were in 
stable hospital employment. A classification 
of suitable patients was made according to 
sex, age, and vocabulary level. The age 
classification was into young (20-39), middle- 
aged (40-59), and old (60-79). The vocabu- 
lary classification was based on the Wechsler- 
Bellevue Vocabulary score as low (under 23) 
and high (23 or above). Ten patients of each 
sex were examined for all combinations of age 
and vocabulary level, 120 undamaged patients 
in all. They were drawn in alphabetical order 
from appropriate wards until the requisite 
numbers were made up. The mean ages were: 
young 29.8, SD 6.7; middle-aged 50.7, SD 
5.6; old 67.3, SD 5.0. The mean vocabulary 
scores were: low group 16.9, SD 3.6, and 
high group 30.3, SD 4.4. 


1 The author is grateful to Dr. R. K. Freudenberg, 
physician superintendent, Netherne Hospital, for fa- 
cilities to conduct this investigation, and to Dr. A. 
Walk, physician superintendent, Cane Hill Hospital, 
Coulsdon, for providing additional brain-damaged 
patients. 

The brain-damaged patients were all such 
(excluding arteriosclerotics) who were pres- 
ent in two neuropsychiatric hospitals in July, 
1955, and were able or willing to cooperate. 
They were classified according to age and vo- 
cabulary in the same way as the undamaged 
group. They were also classified in accord- 
ance with their neuropsychiatrist’s opinion as 
showing mild or severe dementia. According to 
age they were: young 4, middle-aged 29, and 
old 26. According to vocabulary they were: 
high 18 and low 41. According to dementia 
they were: mild 38 and severe 21. The diag- 
noses of the mildly demented were: head in- 
jury 1, cerebral tumor 1, cerebral atrophy 7, 
paresis 11, vascular rupture 1, carbon mon- 
oxide poisoning 1, alcoholic dementia 3, 
Huntington’s chorea 2, organic senile de- 
mentia 11. The diagnoses of the severely 
demented were: head injury 1, cerebral 
atrophy 3, paresis 3, vascular rupture 1, 
alcoholic dementia 2, Huntington’s chorea 5, 
organic senile dementia 6. 

In regard to test anxiety all patients were 
classified according to whether or not they 
displayed observed and stated anxiety. They 
were said to show observed anxiety if the 
tester assessed them as anxious on the evi- 
dence of their test behavior. They were said 
to show stated anxiety if they answered posi- 
tively the question: “Did you feel nervous 
while doing these tests or that you must get 
them right ?” The tester’s assessment was re- 
corded prior to asking this question. The pa- 
tients were also asked if they had had any 
reason to complain of memory difficulties in 
the recent past. 


Tests Employed 


The kinds of verbal learning investigated 
were paired-associate learning and cumulative 
rote learning. Since a common complaint of 
memory difficulties concerns trouble with per- 
sonal names the paired associates were based 
on these: Grocer-Brown, Butcher-Thomson, 
Baker-Williams, Fishmonger-Todd, Chemist- 
Jones, Draper-Cook, Plumber-Smith, News- 
agent-Hunt, Greengrocer-Robinson, Ironmon- 
ger-Lewis. The instructions were: “This is a 
test of your memory for names. There were 
ten shops in a village street. I’m going to tell 
you the names of the shopkeepers and what 
the business of each was. Then I want you to 
tell me the name of each shopkeeper when I 
say his business.” The associates were pre- 
sented auditorily in six cycles, the order of 
each cycle being derived first from randomiza- 
tion and then rearrangement so that in the 
total no two names were in immediate suc- 
cession more than once. All associates were 
said before each cycle in the order of that 
cycle. The administration was in the form: 
“The grocer’s name was .. .” The patient 
was allowed up to 15 seconds to answer, then 
prompted. One mark was given for each cor- 
rect answer so that the maximum was 60; 
the obtained range was 0-57. A score was 
also kept of all names offered which were en- 
tirely outside the list (e.g., Murphy); these 
were termed paramnesias. A parallel test 
based on women’s names was administered to 
81 patients in the undamaged group at six 
weeks after the first testing. 

The cumulative rote-memory test was ad- 
ministered in the style of “This was the House 
that Jack Built.” Five lists were used, each 
comprising 15 names of the same class of ob- 
ject: (a) animals, (b) body parts, (c) ob- 
jects connected with eating, (d) fruits, (e) 
tools. The patient was presented auditorily 
with one item, two items, three items, etc., 
without variation of order until he made an 
error, when the next list was begun. The 
score was the grand total attained without 
error, the maximum being 75 and the ob- 
tained range 15-42. A parallel set of lists 
was administered to 81 patients six weeks 
after the first testing. 

At the end of the first session each pa- 
tient was asked whether he had made use of 
mnemonics, visual images, or other devices in 
learning. 


Relationship of Tests 


The paired-associate and rote-memory tests 
at the first session had a product-moment cor- 
relation of .32. The parallel versions of paired 
associates correlated .72 and those of rote 
memory .78. The split-half correlation of the 
first paired-associate test (with Spearman- 
Brown correction) was .95 for odd and even 
cycles, and .80 for items which were odd and 
even in the order of the first cycle. The first 
and third lists of the rote-memory test cor- 
related with the second and fourth .77. In 
both paired-associate tests the items differed 
very significantly in their degree of difficulty 
but the factors governing this were complex. 
There was no significant relation to the fre- 
quency of the personal names in the commu- 
nity. The item presented first in the first cy- 
cle had an advantage in each test. Difficulty 
of the rote lists varied according to knowledge 
and interest. 
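
The Spearman-Brown correction applied to the split-half correlations can be illustrated with a minimal Python sketch. The function names are hypothetical, and the uncorrected half-test correlation is not reported in the text; it is back-calculated here from the corrected value of .95 purely for illustration.

```python
def spearman_brown(r_half: float, k: float = 2.0) -> float:
    """Step a part-test correlation up to a test k times as long.

    With k = 2 this is the usual split-half correction:
    r_full = 2 * r_half / (1 + r_half).
    """
    return k * r_half / (1.0 + (k - 1.0) * r_half)


def half_from_corrected(r_full: float, k: float = 2.0) -> float:
    """Invert the correction to recover the part-test correlation."""
    return r_full / (k - (k - 1.0) * r_full)


# Corrected split-half value reported for odd versus even cycles.
r_corrected = 0.95

# Implied uncorrected correlation between the two halves (an inference,
# not a figure given in the article).
r_half = half_from_corrected(r_corrected)

print(round(r_half, 3), round(spearman_brown(r_half), 3))
```

The inverse function is included only to show how much lower the raw half-test correlation would be than the corrected figure.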

Observed and stated test anxiety showed a 
fairly close correspondence. Their uncorrected 
phi coefficient was .75. Complaints of mem- 
ory difficulties showed little correspondence 
with anxiety. The phi coefficient with ob- 
served anxiety was .02 and with stated anx- 
iety .12. 
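
As a rough illustration of how a phi coefficient of this kind is obtained from a fourfold table, the following Python sketch may help. The cell counts are hypothetical and are not the study's data, which are not given in the text; the function name is likewise an assumption.

```python
import math


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Uncorrected phi for the 2 x 2 table [[a, b], [c, d]]."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return (a * d - b * c) / denom


# Hypothetical counts, NOT taken from the article.
# Rows: observed anxiety present / absent.
# Columns: stated anxiety present / absent.
a, b, c, d = 40, 6, 8, 66

print(round(phi_coefficient(a, b, c, d), 2))
```

A value near 1 indicates that the two classifications agree on nearly every patient, while a value near 0 indicates no correspondence.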


Sex, Age, and Vocabulary in Undamaged 
Patients 


The relative importance of sex, age, and 
vocabulary level in the performance of the 
undamaged patients on the first versions of 
the paired-associate and cumulative rote tests 
and also in regard to paramnesias was deter- 
mined by analysis of variance. In paired-as- 
sociate learning sex differences were not sig- 
nificant, age differences were significant at 
the 5% level, and differences according to 
vocabulary significant at the 1% level. The 
old had lower scores than both young and 
middle-aged, but the latter did not differ. The 
low vocabulary patients had markedly lower 
scores than the high vocabulary ones. In 
cumulative rote learning sex was not signifi- 
cant, age was significant at the 5% level and 
vocabulary significant at the 1% level. Here 
the middle-aged and old had lower scores than 
the young but did not differ from each other. 
The difference for vocabulary level was as 
marked as in paired-associate learning. In 
paramnesias sex was significant at the 5% 
level, age was not significant, and vocabulary 
was significant at the 5% level. They were 
commoner in males and the low vocabulary 
patients. 


Anxiety and Other Factors in 
Undamaged Patients 


The significance of differences in the occur- 
rence of anxiety, memory difficulties, and in- 
terpolated aids was determined by chi square 
or Fisher’s exact method. The effect of these 
factors on performance was assessed by t 
tests. Observed anxiety was commoner in 
young than old to an extent approaching 
significance. Stated anxiety was significantly 
commoner in young than old at the 1% level 
and in young than middle-aged to an extent 
approaching significance. When comparison 
was made within the separate age groups, 
neither observed nor stated anxiety had any 
significant relationship to efficiency of per- 
formance in paired-associate or rote learning 
nor to the occurrence of paramnesias. Com- 
plaints of memory difficulties had no signifi- 
cant relation to sex, age, or vocabulary nor to 
efficiency of performance in the tests. 

Certain patients showed striking differences 
in score on the two versions of paired-asso- 
ciate learning. To a much smaller extent this 
also occurred in rote learning. Comparison 
was made between those whose change was 
more than one SD of the distribution of 
changes and those where it was not, in re- 
gard to sex, age, vocabulary, anxiety, and 
memory difficulties. In paired associates the 
only significant difference was in regard to 
age. Fluctuations of score were commoner in 
the middle-aged than either young or old, 
which does not seem meaningful. In rote 
memory there were no significant differences. 


Systematic use of interpolated aids was 
made by seven patients, partial use by 32 pa- 
tients, and the remainder relied on direct 
recollection. The use of aids had no signifi- 
cant relation to sex, age, or vocabulary, and 
did not significantly improve efficiency. 


Effects of Brain Damage 


The tests were administered to the brain 
damaged on one occasion only. Comparisons 
were made by t tests within the brain-dam- 
aged group and between it and the undam- 
aged one. The young brain-damaged patients 
were ignored in age comparisons but included 
in the others. 

The middle-aged and old brain-damaged 
patients did not differ significantly on paired 
associates but the middle-aged were signifi- 
cantly better at the 5% level on rote mem- 
ory. The high vocabulary brain-damaged pa- 
tients were significantly better at the 1% 
level than the low vocabulary ones on both 
paired-associate and rote learning. As in the 
undamaged patients paramnesias were signifi- 
cantly commoner among males than females, 
but showed no difference according to age or 
vocabulary. The mildly demented were sig- 
nificantly better at the 5% level than the 
severely demented on rote memory but did 
not differ significantly on paired associates or 
paramnesias. There were no significant dif- 
ferences within the brain-damaged group ac- 
cording to observed or stated anxiety or 
memory difficulties. 

The brain damaged were compared with the 
undamaged in the four subclasses combining 
middle or old age with high or low vocabu- 
lary. They were significantly poorer in each 
subclass on paired associates. They were also 
significantly poorer on rote learning among 
the old low vocabulary patients and almost 
so among the old high vocabulary patients 
but did not differ among the middle-aged. 
The brain-damaged did not differ significantly 
from the undamaged in regard to paramnesias 
or observed and stated test anxiety. Signifi- 
cantly fewer brain-damaged patients at the 
1% level complained of memory difficulties. 
Certain brain-damaged patients gave a rela- 
tively good learning performance, i.e., were 
one SD or more above the mean of the un- 
damaged patients in the same age and vo- 
cabulary subclass. This happened more often 
with rote memory than paired associates. In 
the former, but not the latter, it happened 
more often in the mildly demented. There 
was no relation apparent in this to neuro- 
psychiatric diagnosis. 


Discussion 


Complaints of memory difficulties in them- 
selves would appear to be an unsatisfactory 
pointer to brain damage since the undam- 
aged more frequently make them. In verbal 
learning of the paired associate and cumula- 
tive rote type the efficiency of both undam- 
aged and damaged patients corresponds most 
closely to vocabulary level. Age is also a fac- 
tor of some importance, but anxiety in the 
test situation appears to be a negligible in- 
fluence. In testing for verbal memory defects, 
therefore, it would appear necessary and suffi- 
cient to allow for age and verbal ability. The 
marked fluctuation of efficiency from one test- 
ing to another in undamaged patients on 
paired-associate learning suggests that tests 
should be applied more than once before 
brain damage is inferred. There can be little 
doubt that, when age and vocabulary level 
are allowed for, brain-damaged patients are 
inferior to other patients on paired-associate 
learning. The existence of a few exceptions 
invites further enquiry. The position in cumu- 
lative rote learning is less clear but it would 
seem that brain damage lowers scores only 
when it exacerbates the effects of advanced 
age. The results on anxiety are discordant 
with a considerable body of work on the re- 
lation between manifest anxiety and learning 
in students (2, 3). This may depend either 
on the method of assessing anxiety or on the 
population investigated. The method of as- 
sessing anxiety used here, however, seems to 
reflect closely the realities of the immediate 
situation in which memory is tested. 


Summary 


Factors influencing efficiency in paired-as- 
sociate and rote verbal learning were investi- 
gated in relation to 59 brain-damaged and 
120 other neuropsychiatric patients. Vocabu- 
lary level and age were significant influences, 
but test anxiety was negligible. When vocabu- 
lary and age were allowed for, the brain 
damaged were significantly less efficient than 
the undamaged on paired-associate learning, 
but the position on rote learning was less 
clear. Some undamaged patients showed strik- 
ing fluctuations in paired-associate learning, 
when tested a second time. 


Received June 8, 1956. 
The Stability of the Social Desirability Scale Values 
in the Edwards Personal Preference Schedule¹ 


C. James Klett² 


University of Washington 


The Edwards Personal Preference Schedule 
(EPPS) (5) makes use of a forced-choice 
technique in which items pertaining to differ- 
ing psychological needs but having compa- 
rable social desirability scale values are paired 
in an attempt to minimize the subjects’ natu- 
ral tendency to respond in the socially ap- 
proved direction. In developing the EPPS, 
Edwards (4) scaled 140 items relating to 
14 relatively independent, normal personality 
variables³ drawn from a list of manifest 
needs described by Murray (9). 

In the course of establishing high school 
norms for the EPPS (7, 8), the question was 
raised as to whether the pairs of items in the 
EPPS were equally well matched for social 
desirability when presented to groups other 
than the college population upon which the 
test was developed and standardized. Accord- 
ingly, a group of 206 high school students 
from Lincoln High School in Tacoma, Wash- 
ington was selected, and the items were re- 
scaled for social desirability. Six items relat- 
ing to heterosexuality were omitted as being 
unsuitable for verbal administration to large 
groups of high school students, and the re- 
maining 134 items were rescaled in the same 
manner as that utilized by Edwards (4). 

1 This study was part of a doctoral dissertation 
completed at the University of Washington, 1956. 
Thanks are extended to Dr. Allen L. Edwards and 
to the public school officials of the Tacoma Public 
School System for their valuable assistance. 

2 Now at VA Hospital, Northampton, Massachu- 
setts. 

3 The fifteenth need (abasement) was not scaled 
in the same manner, the scale values being esti- 
mated by means of the regression of probability of 
endorsement on social desirability scale values (5). 


Prior to the scaling, the judges were asked 
to record their age, grade, sex, and a descrip- 
tion of their fathers’ occupation on a special 
blank. From the description of the fathers’ 
occupation, a socioeconomic status (SES) 
classification was made for each judge, based 
upon the SES occupational tables developed 
by the U. S. Bureau of the Census (2), which 
provide for six categories ranging from pro- 
fessional to unskilled worker. This classifica- 
tion was made by two psychologists working 
independently, and disagreements were dis- 
cussed and reconciled. 

Analyses of the judgments about the social 
desirability of each item were made sepa- 
rately by sex, grade, and SES. The median 
interval values of the items for the group of 
boys were plotted against the comparable 
values for the girls, and similar plots were 
made by grade and by SES. It was found that 
neither sex, grade, nor SES produced any 
essential differences in the intervals in which 
the median judgments for the items were 
found. The separate distributions were then 
combined to form a single distribution of 
judgments from which the social desirability 
scale values of the items were obtained by 
means of the method of successive intervals 
(3). These scale values correlated .94 with 
those derived by Edwards (4). 

In order to study the goodness of fit of the 
items which were paired in the EPPS for so- 
cial desirability, a comparison of the scale 
value for each item over all pairs was made 
by means of Fisher’s intraclass correlation 
and found to be .69. Edwards reported a simi- 
lar correlation of .85. Secondly, a comparison 
was made of the distribution of the absolute 
differences between the scale values for item 
pairs with the corresponding distribution ob- 
tained by Edwards. A test of the differences 
between the means of these two distributions 
yielded a t of 3.51, significant beyond the 1% 
level. The mean of the absolute differences de- 
rived from the new scale values was .452, 
while that derived from Edwards’ scale values 
was .314. 
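
The two goodness-of-fit statistics described above can be sketched in Python as follows. This is a minimal illustration that assumes the pairwise form of Fisher's intraclass correlation, in which both members of every pair share a common mean and variance; the article does not spell out the computation. The function names and the scale values below are hypothetical.

```python
from typing import Sequence, Tuple

Pair = Tuple[float, float]


def pairwise_intraclass_r(pairs: Sequence[Pair]) -> float:
    """Fisher's pairwise intraclass correlation for n item pairs."""
    n = len(pairs)
    if n == 0:
        raise ValueError("no pairs supplied")
    # Both members of every pair are pooled into one mean and one variance.
    mean = sum(x + y for x, y in pairs) / (2 * n)
    var = sum((x - mean) ** 2 + (y - mean) ** 2 for x, y in pairs) / (2 * n)
    if var == 0:
        raise ValueError("scale values have zero variance")
    cov = sum((x - mean) * (y - mean) for x, y in pairs) / n
    return cov / var


def mean_abs_difference(pairs: Sequence[Pair]) -> float:
    """Mean absolute difference between the two scale values in a pair."""
    return sum(abs(x - y) for x, y in pairs) / len(pairs)


# Hypothetical scale values for five item pairs, NOT the study's data.
pairs = [(1.2, 1.5), (2.8, 2.1), (0.4, 0.9), (3.3, 3.0), (1.9, 2.6)]

print(round(pairwise_intraclass_r(pairs), 2))
print(round(mean_abs_difference(pairs), 3))
```

Well-matched pairs give an intraclass correlation near 1 and a mean absolute difference near 0, so the two statistics move in opposite directions as the matching deteriorates.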

Finally, to ascertain the relationship of the 
new scale values and actual test performance, 
100 EPPS protocols were drawn at random 
from the high school normative sample (7, 
8), and the proportion of endorsements of 
the first item in each pair was computed. The 
correlation between the proportion of endorse- 
ment and the difference in scale values for the 
items in each pair was found to be .51. 


Discussion 


In the course of deriving his scale values 
for social desirability, Edwards (4) found no 
essential relationships between the interval in 
which the median value of the social desir- 
ability judgments fell and sex, age, or edu- 
cation. The present study shows no very cru- 
cial differences between the sexes or grades. 
Such differences between groups as were found 
seemed to be no greater or more frequent 
than could be expected by chance when study- 
ing as many as 134 pairs of median values. 
In plotting the data for the sexes, for exam- 
ple, only four pairs of median values differed 
by two intervals from the principal diagonal. 
An inspection of the raw data revealed that 
the median values fell near the limits of the 
interval in most cases, so the apparent dif- 
ferences in median judgments would be re- 
duced to somewhat less than an interval of 
two. 

A more unexpected finding was the lack of 
relationship between the median interval value 
as judged by differing socioeconomic groups. 
It would appear, from these findings, that 
subjects tended to judge social desirability in 
others on the basis of a common stereotype 
unrelated to grade, sex, or differential social 
class membership. 

The correlation (.94) of the scale values 
with those of the college group (5) demon- 
strated a remarkable stability in their rela- 
tive size. Fujita (6) obtained a comparable 
correlation with Edwards’ scale values using 
a group of college Nisei as judges, and 
Lovaas,⁴ administering a translated version 
of the items to a group of students in Nor- 
way, obtained a correlation of .78 between 
his scale values and those of Edwards. The 
notion that such a pervasive and stable social 
desirability stereotype exists is a provocative 
one and has possible implications for social 
psychology as well as test theory. What con- 
stitutes socially approved behavior and the 
readiness with which individuals accept the 
stereotype may be utilized as dimensions in 
studying cultural differences. Whether the 
social desirability stereotype is sensitive to 
psychopathology is another avenue of applica- 
tion. The finding that the judged social de- 
sirability of an item did not vary among the 
subgroups mentioned would seem to minimize 
the contribution of this particular variable to 
personality test score differences, either be- 
tween socioeconomic groups or between vary- 
ing samples of the population. This has some 
relevance for test theory, because, if the con- 
cept of social desirability varied markedly, it 
would be difficult, if not impossible, in stud- 
ies such as reviewed by Auld (1), to deter- 
mine whether personality variables or social 
desirability variables account for the observed 
test differences. 

Since the format of the EPPS is constant, 
i.e., the items which appear in the pairs have 
been set, differences in social desirability 
scale values could be expected to change the 
probability of endorsement of either item in 
a particular pair. The goodness of fit of the 
paired items in terms of social desirability 
was evaluated by means of two statistics. The 
intraclass correlation between the scale values 
of item pairs and the t test between the ab-
solute means of the scale value differences 
within item pairs both indicated that Ed- 
wards’ scale values were more successful in 
matching the items in pairs than the present 
scale values. As Edwards was able to ma- 
nipulate freely the items forming the pairs 
until he obtained the best possible fit, such 
a finding is not surprising. It was expected
that, with different scale values and predeter-
mined pairs, the goodness of fit would only
approximate that of the original fit. An im-
plication which this has for the forced-choice


1 Ivar Lovaas, personal communication, June
1956.


Stability of Social Desirability in the Edwards PPS 


format, however, is that, even with a correla- 
tion coefficient of .94 between the new and 
the original scale values, the goodness of fit 
of paired items can be significantly altered. 

A test of the effect that goodness of fit has 
on forced-choice performance was provided 
by a comparison of the proportion of en- 
dorsement of item A of each pair with the 
difference in scale values of the pair. Ed- 
wards (5) was able to reduce the correlation 
between proportion of response and social de- 
sirability scale values of the item from .87 in 
a true-false format to .40 when the items were 
paired for social desirability. The new scale 
values correlated .51. This coefficient is not 
so low as that reported by Edwards, although 
the difference between the two correlations is 
not significant. 


Summary and Conclusions 


The Edwards Personal Preference Schedule 
(EPPS) (5) is so constructed as to pair items 
representing different psychological needs in 
terms of their social desirability scale values. 
Since the determination of the original scale 
values had been made on a college group, the 
present study was designed to determine how 
similar scale values obtained from a high 
school group, and from varying socioeconomic 
status groups within the high school popula-
tion, would be to the original scale values ob- 
tained by Edwards (3). Further, an effort 
was made to determine what effect differences 
in scale values would have on the adequacy 
of matching of the pairs of items in the EPPS 
and on the probability of endorsement of the 
items. It was found that: 


1. There were no differences among socio-
economic groups within the high school popu-
lation as to the median value of their social
desirability judgments on the items. There
was no difference between the grades or sexes.

2. The social desirability scale values ob-
tained from the high school group as a whole 
correlated .94 with those obtained by Ed- 
wards (3). 

3. Despite the high correlation between 
the two sets of scale values, tests of the good- 
ness of fit of the matched pairs in the EPPS 
revealed that the scale values as obtained in 
this study resulted in less adequate matching 
and a correspondingly greater relationship be- 
tween social desirability and probability of
endorsement.


Overinclusive Thinking in a Depressive
and a Control Group


Cameron (1, 2, 3, 4, 5) believes that “over-
inclusion” is one of the most important as- 
pects of schizophrenic thought disorder. The 
schizophrenic is unable to preserve his con- 
ceptual boundaries, so that irrelevant ideas 
become incorporated into his concepts, mak- 
ing his thinking more abstract and less lucid. 

Epstein (6) has contributed enormously to 
the operational definition of “overinclusion” 
by developing a simple pencil-and-paper 
measure of this aspect of thought disorder. 
The Epstein test presents the subject with 
a list of 50 words. Following each stimulus 
word there are six response words (including 
the word “none”). The subject is merely 
asked to underline all those response words 
which are a necessary part of the concept de- 
noted by the stimulus word. Epstein predicted 
from Cameron’s theory that schizophrenics 
would underline more response words than
normals, as their concepts would be more 
overinclusive. This was in fact the case, the 
difference being significant at the .001 level 
of confidence. He also found no differences
between normals and schizophrenics in “un- 
derinclusion” (the tendency to underline too 
few response words). As Epstein points out,
however, “Another question that arises is 
whether overinclusion is characteristic only 
of schizophrenic behaviour, or whether it is 
equally prominent in other psychoses” (6). 

It was the purpose of the present study to 
investigate “overinclusion” in depressives. If, 
as Cameron suggests, this form of thought 
disorder is typical of schizophrenics, depres- 
sives should not differ much from normals. 


Procedure 


Tests. The Epstein Overinclusion test and 
the Mill Hill Vocabulary test were given to 
a group of 11 depressives and 14 normal con- 
trols. The Epstein test was given and scored 
according to Epstein’s procedure.* An “over- 
inclusion score” and an “underinclusion score” 
were then obtained. The Mill Hill Vocabu- 
lary test was also given in the standard way 
(9). Both tests were administered individu- 
ally, and the order of presentation was alter- 
nated. The vocabulary test was included so as 
to match the subjects for vocabulary level. 

Subjects. The patients consisted of 11 de- 
pressives, four males and seven females be- 
tween the ages of 33 and 56. All were inpa- 
tients of the Bethlem Royal or the Maudsley 
Hospital. All had been diagnosed depressive 
by both the registrar and the consultant in
charge of the case. All were regarded as rea- 
sonably typical cases of “endogenous” de- 
pression. 

The controls consisted of 14 normal peo- 
ple, chosen so that as a group they were 
closely matched with the patients for age,
sex, occupation, educational level, and vo-
cabulary level. There were five males and
nine females, between the ages of 28 and 56.
These data are summarized in Table 1. As
can be seen, the controls do not differ signifi-
cantly from the depressives in age or vo-
cabulary score.

*The authors would like to thank Professor Ep-
stein for making available to the psychology depart-
ment of the Institute of Psychiatry his test and scor-
ing key.


Results


The results are presented in Table 2. As 
can be seen the depressives “overinclude” to 
a highly significantly greater amount than do 
the normal controls, contrary to what one 
might expect from Cameron’s writings. In fact 
the depressives in this sample have a much 
higher mean “overinclusion” score (37.91) 
than did the schizophrenics in Epstein’s study 
(20.92). This difference is probably signifi- 
cant, although it cannot be calculated as Ep- 
stein does not give the variance for his schizo- 
phrenic group. 

It is interesting that the control group in 
the present study obtained a mean “overin- 
clusion” score (13.79) almost identical with 
Epstein’s normals (12.49). As the present 
control group is English it is surprising that 
cultural and other factors seem to make so 
little difference. 

There were no significant differences with 
respect to the underinclusion score. This is 
consistent with Epstein’s finding. 

The present study also considered an addi- 
tional score, the “neologism” score. Five of 
the 50 stimulus words are neologisms, as are 
five of the response words. The neologism 
score was merely the total number of re- 


Table 1 


Characteristics of the Depressive and Normal Samples 


                          Depres-    Normal     Significance
Measure                   sives      controls   of difference
Education:
  elementary                8         11
  secondary                 3          3
Occupation:
  semiskilled               5          6
  skilled                   5          6
  managerial
    or professional         1          2
Age:
  Mean                    43.82      44.64      t = 0.23
  SD                       8.51       9.36      p > 0.5
Mill Hill Vocabulary,
  raw score:
  Mean                    52.91      57.21      t = 1.14
  SD                       9.94       8.58      0.50 > p > 0.10
Table 2


Overinclusion, Underinclusion, and Neologisms in
Depressive and Normal Subjects


                        Depres-     Normal      Significance
Measure                 sives       controls    of difference
Over-         Mean      37.91       13.79       t = 3.35
inclusion     SD        22.87        7.66       p < 0.001
              Range     1f to 92    4 to 30
Under-        Mean      12.64       11.14       t = 0.77
inclusion     SD         5.02        4.50       0.50 > p > 0.10
              Range     5 to 21     6 to 19
Neologism     Mean       3.27        1.93       t = 1.47
              SD         2.37        2.12       0.50 > p > 0.10
              Range     0 to 6      0 to 8


sponses (other than “none”) underlined in 
response to a neologism, plus the number of 
neologisms underlined as responses. The de- 
pressives had slightly higher scores, but the 
difference did not reach significance. This sug- 
gests that the tendency to respond to neolo- 
gisms cannot explain the differences in “over- 
inclusion.” 
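
The t values in Tables 1 and 2 can be checked from the printed means, standard deviations, and group sizes (11 depressives, 14 controls). The sketch below is an illustration only. The paper does not say which form of the t test was used; the unpooled-variance (Welch-type) standard error in the hypothetical helper `welch_t` is an assumption, chosen because it comes close to the tabled figures for overinclusion and neologisms.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for two independent groups from summary statistics.

    Assumption: unpooled (Welch-type) standard error; the original
    analysis may have used a different formula.
    """
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Summary statistics as printed in Table 2 (depressives n = 11, controls n = 14).
overinclusion_t = welch_t(37.91, 22.87, 11, 13.79, 7.66, 14)
neologism_t = welch_t(3.27, 2.37, 11, 1.93, 2.12, 14)

print(f"overinclusion t = {overinclusion_t:.2f}")
print(f"neologism t = {neologism_t:.2f}")
```

Working from summary statistics in this way also shows why the comparison with Epstein's schizophrenic group could not be tested: without the variance of that group, the standard error in the denominator cannot be formed.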


Summary and Conclusions 


The present results suggest that depressives 
“overinclude” significantly more than normals 
on Epstein’s test. In fact depressives are prob- 
ably more abnormal with respect to “overin- 
clusion” of thinking than are schizophrenics. 
This is inconsistent with Cameron’s theory, 
as he appears to regard this type of thought 
disorder as specific to schizophrenics. Two ex- 
planations could account for these results: 

1. It is possible that “overinclusion” is re- 
lated to “psychoticism” rather than to schizo- 
phrenia specifically. It is also possible that 
the depressives in the present study were 
more “psychotic” than the schizophrenics in 
Epstein’s study. Similar results have been re- 
ported by Eysenck (7, 8) if “psychoticism” 
is defined operationally in terms of a “factor.” 

2. It is possible on the other hand that 
“overinclusion” is merely related to the spe- 
cific symptom of depression. Schizophrenics 
as a group are probably more depressed than 
are normals, but not as depressed as depres- 
sive patients. 


Received June 25, 1956. 
A compendium of results of psychotherapy
with adults was published a few years ago by 
Eysenck (16). It included reports from 24 
sources on more than 8,000 cases treated by an 
assortment of psychotherapeutic techniques. 
The average percentage of cases reported as im- 
proved (i.e., cured, improved, much improved, 
adjusted, well, etc.) is about 65.¹ Eysenck’s
control or baseline data estimating the remis- 
sion rate in the absence of formal psycho- 
therapy come from two sources. Those of 
Landis (32) for hospitalized neurotics, and 
those of Denker (14) for neurotics treated at 
home by general practitioners, show similar 
remission rates of about 70% for a 2-year pe- 
riod. Comparing these figures with the aver- 
age for the treated cases, Eysenck concluded, 
“. . . roughly two-thirds of a group of neu-
rotic patients will recover or improve to a
marked extent within about two years of the
onset of their illness, whether they are treated
by means of psychotherapy or not” (16, p.
322). He concludes further that “the figures
fail to support the hypothesis that psycho-
therapy facilitates recovery from neurotic dis-
order” (16, p. 323).


1 The data, however, are not quite as “remarkably
stable from one investigation to another” as Eysenck
appears to believe. The 19 reports of the results of
eclectic therapy differ significantly among themselves
[illegible]. A chi square [illegible]
beyond [illegible] level for 18 degrees of freedom.
Eysenck’s point is nonetheless basically reasonable.
The range of per cent improvement [illegible] when one
[illegible] consideration the differences in popula-
tion, chronology, treatment, [illegible] among the studies.


The difficulties attending the evaluation of
psychotherapy have been detailed many times,
most recently by Rosenzweig (47) in a
critique of Eysenck’s findings. Other thought-
ful and well-organized delineations of evalua- 
tion problems include those of Thorne (50), 
Zubin (56, 57), and Greenhill (22), among 
others. It is not within the province of the 
present paper to repeat these accounts. 

The purpose of this paper is to summarize 
available reports of the results of psychother- 
apy with children using Eysenck’s article (16) 
as a model.² Certain departures will be neces-
sitated by the nature of the data, but in the 
main, the form will follow that of Eysenck. 


Baseline and Unit of Measurement 


As in Eysenck’s study, the “unit of meas- 
urement” used here will be evaluations of the 
degree of improvement of the patient by con- 
cerned clinicians. Individuals listed as “much 
improved, improved, partially improved, suc- 
cessful, partially successful, adjusted, partially 
adjusted, satisfactory,” etc., will be grouped 
under the general heading of Improved. The 
Unimproved cases were found in groupings 
like “slightly improved, unimproved, unad- 
justed, failure, worse,” etc. 

The use of the discharge rate of children’s 
wards in state hospitals as a baseline for 
evaluating the effects of psychotherapy is not 
recommended. It is most likely that hospital- 
ized children are initially more disturbed than 


2 Compendia similar to, and overlapping, Eysenck’s
have been published by Zubin (57) and by Miles, 
Barrabee, and Finesinger (39). These tend to be 
more detailed and descriptive. Eysenck’s work is 
most concise; in it, descriptions and discussions of 
individual studies have been subordinated to the 
presentation of overall results. The present writer 
feels that this is the most provocative, and hence 
most fruitful, way of evaluating a collection of psy- 
chotherapeutic results. 
those brought to the child guidance clinics
and family service agencies from which the 
data on treatment are drawn. Few guidance 
clinics or family service agencies accept psy- 
chotic children for treatment, tending instead 
to refer them to the state hospital. Further- 
more, as Rosenzweig (47) points out, the cri- 
teria for discharge from a state hospital are 
probably less stringent than those leading to 
an appraisal of Improved by other agencies. 
For these reasons, available statistics of state 
hospital populations such as those of Witmer 
(52), McFie (38), and Robins and O’Neal 
(46) are not used as baseline data. 

Follow-up evaluations of changes in behav- 
ior problems in normal children also do not 
furnish satisfactory control data. Studies such 
as those of McFie (38) and Cummings (12) 
report markedly conflicting results, probably 
as a function of differences in ages of the sub- 
jects, and of varying follow-up intervals. More 
importantly, behavior like nail biting and nose 
picking can hardly be regarded as comparable 
to the problems for which children are re- 
ferred to guidance clinics. 

The use of a follow-up control group of 
cases closed as unsuccessful, as in the study 
of Shirley, Baum, and Polsky (49), suffers 
from obvious weaknesses. Such a group is not 
comparable to an untreated sample; it ap- 
pears to represent the segment of the treat- 
ment population for which a poor prognosis 
has been already established. 

A common phenomenon of the child guid- 
ance clinic is the patient who is accepted for 
treatment, but who voluntarily breaks off the 
clinic relationship without ever being treated. 
In institutions where the service load is heavy 
and the waiting period between acceptance 
and onset of treatment may range up to 6 
months, this group of patients is often quite 
large. Theoretically, they have the charac- 
teristics of an adequate control group. So far 
as is known, they are similar to treated groups 

in every respect except for the factor of treat- 
ment itself. 

Nevertheless, the use of this type of group 
as a control is not common in follow-up evalu- 
ations of the efficacy of treatment. Three stud- 
ies report follow-up data on such groups. Of 
these, the data of Morris and Soroker (40) 
are not suitable for the purposes of this paper. 
Of their 72 cases, at least 11 had treatment
elsewhere between the last formal contact with 
the clinic and the point of evaluation, while 
an indeterminate number had problems too 
minor to warrant clinic treatment. 

The samples in the remaining two studies 
appear satisfactory as sources of baseline data. 
Witmer and Keller (55) appraised their group 
8 to 13 years after clinic treatment, and re- 
ported that 78% were Improved. In the Lehr- 
man study (34), a one-year follow-up interval 
found 70% Improved. The overall rate of im- 
provement for 160 cases in both reports is 
72.5%. This figure will be used as the base- 
line for evaluating the results of treatment of 
children. 


The Results of Psychotherapy 


Studies showing outcome at close of treat- 
ment are not distinguished from follow-up 
studies in Eysenck’s aggregation. The distinc- 
tion seems logical, and is also meaningful in 
the predictive sense, as the analyses of this 
paper will indicate. Of the reports providing 
data for the present evaluation, thirteen pre- 
sent data at close, twelve give follow-up re- 
sults, and five furnish both types, making a 
total of eighteen evaluations at close and 
seventeen at follow-up. The data of two re- 
ports (29, 30) are based on a combined close- 
follow-up rating. Results for the three kinds 
of evaluations will be presented separately. 

The age range covered by all studies is from 
preschool to 21 years at the time of original 
clinic contact, the customary juncture for the 
determination of age for the descriptive data. 
However, very few patients were over 18 years 
at that time, and not many were over 17. 
The median age, roughly estimated from the 
ranges, would be about 10 years. 

The usual psychiatric classification of men- 
tal illnesses is not always appropriate for 
childhood disorders. The writer has attempted
to include only cases which would crudely be
termed neuroses, by eliminating the data of
delinquents, mental defectives, and psychotics
whenever possible. The latter two groups con-
stituted a very small proportion of the clinic
cases. The proportion of delinquent cases is
also small at some clinics but fairly large at
others. Since the data as presented were not
always amenable to these excisions, an un-
known number of delinquent cases are in-
cluded. However, the outcomes for the sepa- 
rated delinquents are much the same as those 
for the entire included group. 

As in Eysenck’s study, a number of reports 
were excluded here for various reasons. The 
investigations of Healy and Bronner (24), 
Feiker (18), Ellis (15), Mann (37), and 
Giddings (20) were eliminated because of 
overlap, partial overlap, or suspected overlap 
of the sample with samples of included re- 
ports. Those of Bennett and Rogers (3), Rich 
(45), Hunt, Blenkner, and Kogan (27), 
Schiffmann and Olson (48), and Heckman 
and Stone (25) were not useable either be- 
cause of peculiar or inadequate presentation 
of data, or because results for children and 
adults were inseparable.

The number of categories in which patients 
were classified varied from study to study. 
Most used either a three-, four-, or five-point
scale. A few used only two categories, while
one had twelve. Classification systems with
more than five points were compressed into
smaller scales. The data are presented tabu-
larly in their original form, but the totals are
pooled into three categories, Much Improved,
Partially Improved, and Unimproved. A sum-
mation of the former two categories gives the 
frequency of Improved Cases. 

A summary of results at close is shown in 
Table 1. Results of follow-up evaluations are 
summarized in Table 2, while the results from 
two studies using a combined close-follow-up 
evaluation are presented in Table 3. In the 
latter two tables, the follow-up interval is 
given as a range of years, the usual form of 
presentation in the studies. An attempt has 
been made to compute an average interval 
per case, using the midpoint of the range as 
a median when necessary. These averages are 
tenuous since it cannot be safely assumed 
that the midpoint actually is the median 
value. For example, in the Healy-Bronner in- 
vestigation (23), the range of intervals is 1 
to 20 years, but the median is given as 2½
years. Since the proportion of cases which can 
be located is likely to vary inversely with the 
number of years of last clinic contact, the 
averages of 4.8 years for the follow-up studies 
and 2.3 years for the close-follow-up studies 
are probably overestimates. 

Table 1 shows that the average percentage 
of improvement, i.e., the combined percent- 


Table 1
Summary of Results of Psychotherapy with Children at Close

                       Much       Partially                   Per cent
Study            N     Improved   Improved      Unimproved    improved
(11)             57      16        18   12        11            80.7
(26)            100      13        18   42        26    1       73.0
(28)             70      12        29   19        10            85.7
[illegible]     250      54        82   46        68            72.8
(34)            196      76        52             68            65.3
(31)             50      15        18             17            66.0
(10)            126      25        54             47            62.7
(53)            290      75       154             61            79.0
(2)             814     207       398            209            74.3
(43)             72      26        31             15            79.2
(33)            196      93        61             42            78.6
(6)              27       5        11             11            59.3
(9)              31      13         8             10            67.7
(8)              23       2         9             12            47.8
(7)              75      35        22             18            76.0
(1)              80      31        21             28            65.0
(35)            522     225                      297            43.1
[illegible]     420     251                      169            59.8
All cases     3,399   1,174     1,105          1,120            67.05
Per cent      100.00   34.54     32.51          32.95
Table 2
Summary of Results of Psychotherapy with Children at Follow-up

            Interval              Much       Partially                   Per cent
Study       in years       N      Improved   Improved      Unimproved    improved
(33)        1-5           197      49         55   39        38   16       72.6
(5)         2              33       8         11    7         6    1       78.8
(11)        2-3            57      25         17    6         6    3       84.2
(52)a       1-10          366      81         78  106       101            72.4
(28)        2-3            70      21         30   13         6            91.4
(51)        5-8            17       7          3              4    3       58.8
(34)        1             196      99         46             51            74.0
(41)        16-27          34      22         11              1            97.1
(2)         1-20          705     358        225            122            82.7
(4)         5-18          650     355        181            114            82.5
(36)        3-15          484     111        264            109            77.5
(19)        1-4           732     179        398            155            78.8
(13)        5             359     228         80             51            85.8
(21)        1-2            25       6         12              7            72.0
(42)        1-2            25      10          6              9            64.0
(35)        [illegible]   191      82                       109            42.9
(23)        1-20           78      71                         7            91.0
All cases   4.8b        4,219   1,712      1,588            919            78.22
Per cent                100.00   40.58      37.64          21.78


a Data based on 13 studies originally reported in (54); results of 8 of these are included here.


b Estimated average follow-up interval per case.


ages in the Much Improved and Partially Im- 
proved categories is 67.05 at close. It is not 
quite accurate to say that the data are con- 
sistent from study to study. A chi-square 
analysis of improvement and unimprovement 
yields a value of 230.37, which is significant
beyond the .001 level for 17 df. However, as 
in the case of Eysenck’s data, there is a con- 
siderable amount of consistency considering 
the interstudy differences in methodology, 
definition, etc. 
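
The consistency analysis described above can be sketched as a chi-square test of homogeneity on the improved and unimproved counts of the eighteen evaluations at close. This is an illustration under stated assumptions: the counts are read from Table 1, where several cells are hard to read in this copy and were reconciled against the printed row percentages and column totals, and the helper name `homogeneity_chi_square` is hypothetical. Under those assumptions the statistic comes out close to the reported 230.37 on 17 df.

```python
# (N, number improved) for the 18 evaluations at close in Table 1.
# Assumption: several cells are hard to read in this copy; these counts
# were reconciled against the printed row percentages and column totals.
STUDIES_AT_CLOSE = [
    (57, 46), (100, 73), (70, 60), (250, 182), (196, 128), (50, 33),
    (126, 79), (290, 229), (814, 605), (72, 57), (196, 154), (27, 16),
    (31, 21), (23, 11), (75, 57), (80, 52), (522, 225), (420, 251),
]


def homogeneity_chi_square(studies):
    """Pearson chi-square for a 2 x k table of improved vs. unimproved."""
    total_n = sum(n for n, _ in studies)
    total_improved = sum(improved for _, improved in studies)
    p = total_improved / total_n
    chi_square = 0.0
    for n, improved in studies:
        expected_improved = n * p
        expected_unimproved = n * (1 - p)
        unimproved = n - improved
        chi_square += (improved - expected_improved) ** 2 / expected_improved
        chi_square += (unimproved - expected_unimproved) ** 2 / expected_unimproved
    return chi_square, len(studies) - 1


chi_square, df = homogeneity_chi_square(STUDIES_AT_CLOSE)
print(f"chi-square = {chi_square:.2f} on {df} df")
```

Most of the total comes from the two studies that used only improved and unimproved categories, which is consistent with the later remark that the statistic drops sharply when those two studies are set aside.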

The average percentage of improvement in 
the follow-up studies is given in Table 2 as
78.22. The percentage for the combined close-
follow-up evaluations is 73.98, roughly be- 
tween the other two. The percentage of im- 
provement in the control studies was 72.5,
slightly higher than the improvement at close 
and slightly lower than at follow-up. It would 
appear that treated children are no better off 
at close than untreated children, but that 
they continue to improve over the years and 
eventually surpass the untreated group. 

This conclusion is probably specious, per-
haps unfortunately. One of the two control
studies was an evaluation one year after the 


Table 3
Summary of Results of Psychotherapy with Children Based on Combined Close-Follow-up Evaluation

            Interval              Much       Partially                   Per cent
Study       in years       N      Improved   Improved      Unimproved    improved
(29)        1-10          339      94         81   76        88            74.04
(30)        1-10           30       9         13              8            73.33
All cases   5.5a          369     103        170             96            73.98
Per cent                100.00   27.91      46.07          26.02


a Estimated average follow-up interval per case.


Results of Psychotherapy with Children 


last clinic contact, the other 8 to 13 years
after. The former study reports only 70%

improvement while the longer interval pro- 
vided 78% improvement. The figure for the 
one-year interval is similar to the results at 
close, while the percentage of improvement 
for the control with the 8- to 13-year interval 
is almost identical with that for the follow-up 
studies.

The point of the analysis is more easily
seen if the results at close and at follow-up
are pooled. This combination gives the same
sort of estimate as that furnished by the two
control groups pooled since one of them is a
long-interval follow-up while the other was 
examined only a short time after clinic con- 
tact. The pooled percentage of improvement 
based on 7,987 cases in both close and fol- 
low-up studies is 73.27, which is practically 
the same as the percentage of 72.5 for the 
controls.
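
The pooled figure can be reproduced by summing cases rather than averaging percentages. The sketch below is illustrative; the dictionary keys and the helper `pooled_percentage` are hypothetical names. One assumption is flagged: the stated total of 7,987 cases is matched only when the 369 cases with a combined close-follow-up rating (Table 3) are added to the 3,399 cases at close and the 4,219 at follow-up, so all three are included here.

```python
# (total N, N improved) for each kind of evaluation.
# Close and follow-up totals are those given in Tables 5 and 4; the combined
# close-follow-up figures (369 cases, 103 + 170 improved) are from Table 3.
# Assumption: all three kinds enter the pooled total of 7,987 cases.
EVALUATIONS = {
    "close": (3399, 2279),
    "follow-up": (4219, 3300),
    "combined close-follow-up": (369, 273),
}


def pooled_percentage(groups):
    """Total cases and percentage improved after summing over all groups."""
    groups = list(groups)
    total_n = sum(n for n, _ in groups)
    total_improved = sum(improved for _, improved in groups)
    return total_n, 100.0 * total_improved / total_n


total_n, pooled = pooled_percentage(EVALUATIONS.values())
print(f"{total_n} cases, {pooled:.2f}% improved")
```

Pooling by cases weights each kind of evaluation by its size, which is why the result sits between the at-close and follow-up percentages but closer to the larger follow-up group.
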

It now appears that Eysenck’s conclusion
concerning the data for adult psychotherapy
is applicable to children as well; the results
do not support the hypothesis that recovery
from neurotic disorder is facilitated by psy-
chotherapy.

The difference between results at close
and at follow-up suggests that time is a fac-
tor in improvement. Denker’s report (14)
also indicated the operation of a time factor.
He found that 45% of the patients had re-
covered by the end of one year, 72% had re-
covered by the end of two years, 82% by
three years, 87% by four years, and 91% by
five years. The rate of improvement as a
function of time in Denker’s data is
negatively accelerated.

A Spearman rank-order correlation between
estimated median follow-up interval and per-
centage of improvement in the 17 studies in
Table 2 is .48, p = .05. This estimate of rela-
tionship should be viewed with caution be-
cause of the aforementioned difficulties in de-
termining median intervals. The coefficient was
corrected for tied ranks, which tends to
make it a conservative estimate. It is also, of
course, insensitive to the curve of the bivariate
distribution.

The percentage of improvement as a func-
tion of time interval is shown by the data of
Table 4. The studies have been grouped at
five time-interval points in the table. There
are four studies with estimated median inter- 
vals of 1-1½ years, six with intervals of 2-2½
years, three with 5-6½ years, two with 10
years, and two with 12 years. 
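
A rank-order correlation of the kind reported for the follow-up studies can be sketched as the Pearson correlation of average ranks, which is one common way of handling ties. The code is an illustration, not the author's computation. Two assumptions are made: each study is placed in one of the five interval groups of Table 4 instead of being given its own estimated median, and the assignment of studies to groups is inferred from the group totals. The coefficient obtained this way is moderately positive and need not equal the reported .48.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient computed as the Pearson correlation of ranks."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    mean_x = sum(rank_x) / len(rank_x)
    mean_y = sum(rank_y) / len(rank_y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    spread_x = sum((a - mean_x) ** 2 for a in rank_x)
    spread_y = sum((b - mean_y) ** 2 for b in rank_y)
    return covariance / (spread_x * spread_y) ** 0.5


# (interval group 1..5, per cent improved) for the 17 follow-up studies.
# Assumption: group membership is inferred from the totals in Table 4, and
# groups stand in for the per-study medians, so rho is only approximate.
FOLLOW_UP = [
    (1, 74.0), (1, 72.0), (1, 64.0), (1, 42.9),
    (2, 78.8), (2, 84.2), (2, 91.4), (2, 78.8), (2, 72.6), (2, 91.0),
    (3, 85.8), (3, 58.8), (3, 72.4),
    (4, 82.7), (4, 77.5),
    (5, 82.5), (5, 97.1),
]

rho = spearman_rho([g for g, _ in FOLLOW_UP], [p for _, p in FOLLOW_UP])
print(f"rho = {rho:.2f}")
```

Because only ranks enter the calculation, the coefficient says nothing about the shape of the relationship, which is the limitation the text notes.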

The data of Table 4 indicate that most 
of the correlation between improvement and 
time-interval is accounted for by the studies 
with the shortest intervals, and those with the
largest. The curve is more or less the same as 
that of Denker’s data, negatively accelerating 
with most of the improvement accomplished 
by 2½ years. It is peculiar that the improve-
ment after 1½ years is about 60%, less than
the 67% improvement at close. However, the 
difference is not too great to attribute to vari- 
ations in methodology and sampling among 
the concerned studies. Another potential ex- 
planation will be offered shortly. 
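
The shape of the curve can be read off Table 4 by computing the per cent improved in each interval group and the change from one group to the next. The sketch below uses the totals as given in the table; the variable names are illustrative only.

```python
# (interval label in years, total N, N improved) as given in Table 4.
TABLE_4 = [
    ("1-1 1/2", 437, 261),
    ("2-2 1/2", 1167, 929),
    ("5-6 1/2", 742, 583),
    ("10", 1189, 958),
    ("12", 684, 569),
]

percentages = [100.0 * improved / n for _, n, improved in TABLE_4]
# Change in per cent improved from each interval group to the next.
gains = [later - earlier for earlier, later in zip(percentages, percentages[1:])]

for (label, _, _), pct in zip(TABLE_4, percentages):
    print(f"{label:>8} years: {pct:.2f}% improved")
print("successive changes:", [round(g, 2) for g in gains])
```

The first step, from the shortest interval group to the 2-2½ year group, is by far the largest change; later steps are small by comparison, which is the negatively accelerating pattern described above.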

This analysis suggests that improvement is 
in part a function of time, though the mecha- 
nisms involved remain purely speculative. Fu- 
ture comparisons of the results of psycho- 
therapy should properly take this factor into 
consideration. 

Inspection of the data in Table 1 discloses 
another potential factor in the improvement 
rate. The studies in which only two rating 
categories, improved and unimproved, have 
been used, appear to furnish lower percent- 
ages of improvement than the average. In the 
two reports of this kind in Table 1, the av- 
erage improvement is only 50.5% compared 
with the overall 67%. A complete analysis of 
percentage of improvement as a function of 


number of categories is shown in Table 5. 


Table 4


Improvement as a Function of the Interval Between
Last Clinic Contact and Follow-up


Estimated
median       Number
interval     of         Total     N           Per cent
in years     reports    N         improved    improved
1-1½           4          437       261        59.73
2-2½           6        1,167       929        79.61
5-6½           3          742       583        78.57
10             2        1,189       958        80.57
12             2          684       569        83.19
All cases     17        4,219     3,300        78.22
Table 5


Improvement as a Function of the Number of Points
on the Rating Scale in Evaluation at Close


Number       Number
of           of         Total     N           Per cent
points       reports    N         improved    improved
2              2          942       476        50.53
3             12        1,980     1,442        72.83
4              2          320       242        75.63
5              2          157       119        75.80
All cases     18        3,399     2,279        67.05


Examination of Table 5 indicates that three-, 
four- and five-point rating scales produce 
about the same percentage of improvement. 
The use of a two-point scale, however, re- 
sults in over 20% less improvement than the 
others.³ This kind of analysis cannot be ap-
plied to the data in Table 2 since it will be 
confounded by the time factor. 
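
The comparison in Table 5 can be made explicit by pooling the three-, four-, and five-point scales and setting the result against the two-point scale. This is a minimal sketch using the totals as printed; the helper `percent_improved` is a hypothetical name.

```python
# (points on rating scale, total N, N improved) as given in Table 5.
TABLE_5 = [(2, 942, 476), (3, 1980, 1442), (4, 320, 242), (5, 157, 119)]


def percent_improved(rows):
    """Pooled percentage improved over the given rows."""
    total_n = sum(n for _, n, _ in rows)
    total_improved = sum(improved for _, _, improved in rows)
    return 100.0 * total_improved / total_n


two_point = percent_improved([row for row in TABLE_5 if row[0] == 2])
finer = percent_improved([row for row in TABLE_5 if row[0] > 2])
print(f"two-point scale: {two_point:.2f}%")
print(f"three- to five-point scales pooled: {finer:.2f}%")
print(f"difference: {finer - two_point:.2f} percentage points")
```

The difference is expressed in percentage points, which is the sense in which the two-point scale yields "over 20% less improvement" than the finer scales.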

Evidently, a certain proportion of the un- 
improved cases in the studies using two cate- 
gories would have fallen in partially improved 
categories if they had been utilized. A number 
of cases in which a fair amount of improve- 
ment was manifested are forced into the un- 
improved category when central points are not 
available. A two-point scale thus seems to be 
overly coarse. It is desirable that finer scales 
be used in future evaluation studies. 

The study of Maas et al. (35), which fur-
nishes three-quarters of the cases in the 1-1½
year interval group in Table 4, used a two-
point scale. The percentage of improvement 
is only 43, which may account for the fact 
that this time-interval group has a lower per- 
centage of improvement than in the studies 
at close. 

There are a number of different kinds of 
therapies which have been used in the stud- 
ies reported here. The therapists have been
psychiatrists, social workers, and teams of cli-
nicians operating at different points in the 


3 The marked difference between the two-point
scale studies and those using finer scales is reflected
in the consistency analysis. The chi square for 17 df
was 230.37, but when the two-category studies are
eliminated, it falls to 52.66 for 15 df. The value is
significant beyond the .01 level, but the original chi
square has been decreased by more than 75% with a
loss of only two df.
patient’s milieu. Therapeutic approaches in-
cluded counseling, guidance, placement, and
recommendations to schools and parents, as
well as deeper level therapies. In some in-
stances the patient alone was the focus of at-
tention. In others, parents and siblings were
also treated. The studies apparently encom-
passed a variety of theoretical viewpoints, al-
though these are not usually specified. Viewed
as a body, the studies providing the data for
Tables 1, 2, and 3 are therapeutically eclectic,
a plurality, perhaps, reflecting psychoanalytic
approaches.

Thus we may say that the therapeutic eclec-
ticism, the number of subjects, the results,
and the conclusions of this paper are mark-
edly similar to those of Eysenck’s study. Two-
thirds of the patients examined at close and
about three-quarters seen in follow-up have
improved. Approximately the same percent-
ages of improvement are found for compa-
rable groups of untreated children.

As Eysenck pointed out (17) in a sequel to
his evaluation, such appraisal does not prove
that psychotherapy is futile. The present
evaluation of child psychotherapy, like its
adult counterpart, fails to support the hy-
pothesis that treatment is effective, but it
does not force the acceptance of a contrary
hypothesis. The distinction is an important
one, especially in view of the differences
among the concerned studies, and their gen-
erally poor caliber of methodology and analy-
sis. Until additional evidence from well-
planned investigations becomes available, a
cautious, tongue-in-cheek attitude toward
child psychotherapy is recommended.


Summary 


A survey of eighteen reports of evaluations
at close, and seventeen at follow-up, was com-
pared with similar evaluations of untreated
children. Two-thirds of the evaluations at
close, and three-quarters at follow-up, showed
improvement. Roughly the same percentages
were found for the respective control groups.

A crude analysis indicates that time is a fac-
tor in improvement in the follow-up studies;
the rate of improvement with time is nega-
tively accelerating. Further analysis contra-
indicates the use of only two categories for
evaluation. This scale tends to give much
lower rates of improvement than three-, four-,
and five-point scales.

It is concluded that the results of the pres- 
ent study fail to support the view that psy- 
chotherapy with “neurotic” children is effec- 
tive. 


Received August 20, 1956.


The acquisition by the child of normal sex-
role behavior is a fundamental aspect of to-
tal personality development and adjustment.
There are two major reasons why a better un- 
derstanding is needed of the process by which 
a little girl comes to adopt the feminine role 
and learns how to be a “woman” and a little 
boy comes to adopt the masculine role and 
learns how to be a “man.” One reason is theo- 
retical and the other is practical. While there 
is an abundance of speculation, based largely 
on adults, concerning the nature and dynam- 
ics of sex-role adjustment, reviews of the lit- 
erature on sex differences (1, 11) show an 
almost complete absence of studies that spe- 
cifically deal with the problem of sex-role de- 
velopment in children. The practical need for 
data in this area comes from the increasing
recognition by workers in clinical psychology
and psychiatry that difficulties or distortions
in sex-role adjustment appear to be function-
ally related to the occurrence of personality
maladjustments and certain [types] of emo-
tional disorders. This suggests a direct link
between childhood learning and [...]
in sex-role behavior and adult personality dis-
turbances.

In a [previous] investigation (2) the [writer]
reported on the development of a masculinity-
femininity scale for use with [...],
the It Scale for Children (ITS).2 Results ob-
tained from the use of this scale with [...]
and 68 female kindergarten subjects were pre-
sented and discussed in terms of the identifi-
cation process, sex differences in the accept-
ance of appropriate sex roles, [...]
roles, and homosexuality in relation to sex-
role development.


1 This article is based on a paper [...] at the an-
nual meeting of the American Psychological Associa-
tion [...].

2 [Published] by Psychological Test Specialists, Box
11, Grand Forks, North Dakota.


The Present Investigation 


The present paper represents an extension 
of the original research on the development 
of masculinity-femininity patterns in chil- 
dren (2). Whereas the initial investigation 
was concerned with a single age group from 
about 5½ to 6½ years, the present study in-
volves a considerably larger age range, thus
making possible the exploration of the factor
of age in relation to sex-role adjustment.


The Problem 


The concern of the present study is to pro- 
vide an analysis of the projected preferences 
of male and female children for various as-
pects of the masculine and feminine roles.
The concept, sex-role preference, refers to the 
degree that one or the other sex role is pre- 
ferred by the child and may be operationally 
defined on the basis of preferential responses 
of children to objects and figures that are 
typical of one sex in contrast to the other sex. 


Subjects 


Six hundred and thirteen children, 303 boys 
and 310 girls, between the ages of approxi-
mately 5½ and 11½ were used as Ss and were
tested in the spring of 1955.* These children 
constituted all pupils enrolled in classes from 
kindergarten through the fifth grade in the 
Pleasanton, California, Elementary School. 


3 Acknowledgment is made to Mr. W. V. Speaks, 
formerly A1/C, U.S.A.F., Parks Air Force Base, 
California, who administered the It Scale for Chil- 
dren to each of the 613 subjects. 

4 Acknowledgment is made to Mr. John C. Mann,
Assistant Superintendent, to Mr. Thomas S. Hart,
Principal, and to the teachers of the Pleasanton Ele-
mentary School, California, for their cooperation and
assistance that made possible the present study.


The mean IQs of classes, second through fifth
grade, based on the California Short-Form
Test of Mental Maturity, ranged from 103 to
106. The socioeconomic status of the families
of most of the Ss ranged from lower-middle
to upper-middle. Approximately 33% were
from families in which the father was on ac-
tive duty at nearby Parks Air Force Base,
and another 10% were from families in which
the father held a position with the Federal or
state government (Military, VA Hospital, or
Atomic Energy Laboratory).


Daniel G. Brown


Procedure


The It Scale for Children was given on an
individual basis to each S. This scale is made
up of 36 picture cards, 3 by 4, of objects and
figures socially defined and identified with the
masculine or feminine roles in our culture.
The projective element in the It scale is a
child-figure referred to as “It,” which is used
to facilitate the child’s expression of his or
her own role preference. The It-figure was in-
tentionally drawn so that it would be am-
biguous and relatively unstructured as to
sexual identity. Each S, rather than being
asked to choose directly, is asked to make
choices for It. There are four major sections
that comprise the scale:


Toy Pictures Section, made up of sixteen pictures,
eight male objects (e.g., tractor and rifle) and eight
female objects (e.g., doll and dishes) to which the
child responds by having It make a total of eight
choices. Each choice of a male item is scored one
point, each choice of a female item is scored zero.

Eight Paired Pictures Section, made up of eight
pairs of pictures of masculine and feminine alterna-
tives (e.g., Indian Chief and Indian Princess, Cos-
metic Articles and Shaving Articles, etc.) to which
the child responds by having It choose the one of
each pair that It would rather be, have or wear.
Each choice of a male item is scored eight points,
each choice of a female item is scored zero.

Four Child-Figures Section, made up of pictures
of four children: a girl, a girlish boy (boy dressed
as a girl), a boyish girl (girl dressed as a boy) and
a boy, to which the child responds by having It
choose the one It would rather be. Choice of the
boy is scored twelve points, of the girlish boy eight
points, of the boyish girl four points, and of the girl
zero.

Parental Role Section,5 which involves asking the
child whether It would rather be a mother or a
daddy when It grows up.


Table 1

Group Scores, Variability and Differences by Grade and Sex in Masculinity-Femininity Preference

Grade and Sex      N     Mdn      M       SD     CR*     F*

Kindergarten
  Boys            44    72.50   66.18   19.29
  Girls           46    41.16   42.50   27.93   4.65   2.10
First Grade
  Boys            55    77.00   66.04   25.39
  Girls           73    72.00   52.07   33.72   2.66   1.76
Second Grade
  Boys            52    81.16   77.58   17.17
  Girls           60    80.21   57.28   35.12   3.93   4.18
Third Grade
  Boys            56    81.46   77.93   18.70
  Girls           58    79.97   59.02   32.92   3.75   3.10
Fourth Grade
  Boys            51    81.23   75.98   20.15
  Girls           40    71.16   56.40   31.73   3.36   2.48
Fifth Grade
  Boys            45    80.87   76.73   17.05
  Girls           33    12.00   22.15   27.92   9.82   2.68

* Values are significant at or beyond the .05 level.
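
The scoring rules given for the first three sections can be sketched in a few lines of Python. This is only an illustration: the function name, argument names, and input checks are hypothetical, and the sketch assumes that the Parental Role Section does not enter the total, since 8 + 64 + 12 already equals the maximum score of 84 mentioned in the Results.

```python
# Point values taken from the section descriptions; names are illustrative.
TOY_POINTS = 1          # per masculine toy choice (eight choices in all)
PAIRED_POINTS = 8       # per masculine choice among the eight pairs
CHILD_FIGURE_POINTS = {
    "boy": 12,
    "girlish boy": 8,
    "boyish girl": 4,
    "girl": 0,
}


def it_scale_score(toy_male_choices, paired_male_choices, child_figure):
    """Return a total It-scale score between 0 and 84.

    Assumes the Parental Role Section is tabulated separately and
    contributes no points to the total.
    """
    if not 0 <= toy_male_choices <= 8:
        raise ValueError("toy_male_choices must be between 0 and 8")
    if not 0 <= paired_male_choices <= 8:
        raise ValueError("paired_male_choices must be between 0 and 8")
    if child_figure not in CHILD_FIGURE_POINTS:
        raise ValueError("unknown child figure: %r" % (child_figure,))
    return (toy_male_choices * TOY_POINTS
            + paired_male_choices * PAIRED_POINTS
            + CHILD_FIGURE_POINTS[child_figure])


exclusively_masculine = it_scale_score(8, 8, "boy")
exclusively_feminine = it_scale_score(0, 0, "girl")
mixed_pattern = it_scale_score(4, 4, "boyish girl")
```

Under these assumptions the Eight Paired Pictures Section carries most of the weight: it can contribute up to 64 of the 84 points.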


Results 


Tables 1 and 2 contain all of the data used 
in the statistical analysis and results reported 
in the present paper. 


Group Sex-Role Patterns as a Whole 


Significant mean differences occur at each
age level showing that boys score more mas-
culine and girls more feminine (Table 1).
This would be expected, since the It scale is
composed of items associated with one sex in
contrast to the other sex. Since the scoring of
the It scale is such that masculine choices are
credited with points while feminine choices
are not, a score of 84 represents an exclu-
sively masculine preference throughout, while
a score of zero represents an exclusively
feminine preference. The lower mean score of
girls compared to boys in each age group in-
dicates that boys score more masculine than
girls and, conversely, girls score more femi-
nine than boys. Even so, it may be noted in
Table 1 that the median difference between
boys and girls in the first through third grade
is quite small, indicating that [many girls in]
these grades score very masculine. [This is]
evident in various other types of [analysis]
discussed below. Also indicated is [the fact that]
at each age level girls compared [to boys are]
significantly more variable in their [sex-role]
preference.
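
The statement about variability can be checked against the standard deviations in Table 1. The sketch below assumes that the last column of that table is a variance ratio, the larger variance divided by the smaller; the paper does not say so explicitly, but the legible entries agree with that reading. The function name is hypothetical.

```python
def variance_ratio(sd_larger, sd_smaller):
    """Ratio of two variances, computed from the standard deviations."""
    return (sd_larger / sd_smaller) ** 2


# Standard deviations (girls, boys) as printed in Table 1.
kindergarten_ratio = variance_ratio(27.93, 19.29)
second_grade_ratio = variance_ratio(35.12, 17.17)

# Compare with the tabled values of 2.10 and 4.18.
print(round(kindergarten_ratio, 2), round(second_grade_ratio, 2))
```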

A comparison between test data of the kin-
dergarten sample of the present study with
the kindergarten sample in the [original] in-
vestigation of 1953 (2) shows that on the
whole, there is very [...]
sex-role preference patterns of [the two] sam-
ples that are separated [...]
years, that are different geographically [...]
that are different in terms of [...]


5 This section was added to [...] after the [...]


Boys show a predominantly masculine role
preference at the kindergarten and first-grade
levels (ages 5½ to 7½) and an even stronger
masculine preference at the second-grade
through the fifth-grade levels (ages 7 to 11½).
When boys in all of the grades are combined, 
about 63% respond with exclusive or near- 
exclusive masculine preference while only 
about 4% respond with an exclusive or near- 
exclusive feminine preference. 

In sharp contrast to the strong masculine 
role preference of boys, girls as a group do 
not show nearly the same degree of feminine 
role preference. At the kindergarten level, 
girls show what may be described as a 
“mixed” role pattern, i.e., one that is charac- 
terized by relatively equal preference of both 
masculine and feminine elements. Beginning
at the first grade and extending through the
fourth grade (ages from about 6½ to 10½) a
much stronger preference is expressed for as- 
pects of the masculine role than for the femi- 
nine role. When girls in all grades are com- 
bined, 40% score at or near an exclusively 
masculine score and about 17% score at or 
near an exclusively feminine score.

A marked change in sex-role preference pat- 
terns occurs in girls in the fifth grade (age 
range from about 9 years 10 months to 11 
years 6 months with a median age of 10 years 
11 months). In contrast to girls in all earlier 
age groups, the fifth-grade girls appear to 
show much less preference for the masculine 
role and express instead a stronger and in- 
creased preference for things that are femi- 
nine. This apparent change in role preference 
in girls at this age level appears so marked 
that evidence from other studies of similar 
age groups should be established before the 
present finding is accepted. In addition, the 
sex-role preference of Ss in the sixth- and 
seventh-grade levels should be investigated. 
An interesting problem here is whether girls 
as a group become more feminine and less 
masculine in terms of preference just prior to 
and during the pubescent-adolescent period. 

Despite this apparent shift toward greater
acceptance of the feminine role in fifth-grade
girls, it is still necessary to recognize that
even in this age group a larger percentage of
girls compared to boys express a preference
for the role of the opposite sex.


Table 2

Percentages of Masculine and Feminine Responses of Boys and Girls to
Various Sections of the It Scale for Children

                  Total %       Total % eight     Total % four        Total %
                   toys         paired pictures   child figures     parental role
Grade and Sex   Masc.  Fem.     Masc.  Fem.      Boy  Mix.  Girl    Daddy  Mother

Kindergarten
  Boys            75    25       79     21        71   18    11       77     23
  Girls           58    42       50     50        37   30    33       52     48
First Grade
  Boys            78    22       78     22        78    6    16       82     18
  Girls           66    34       61     39        59   11    30       64     36
Second Grade
  Boys            93     7       93      7        90    2     8       94      6
  Girls           70    30       68     32        67   10    23       73     27
Third Grade
  Boys            91     9       93      7        89    7     4       95      5
  Girls           70    30       70     30        72    2    26       72     28
Fourth Grade
  Boys            90    10       91      9        88    4     8      [?]    [?]
  Girls           65    35       68     32        60   12    28       77     23
Fifth Grade
  Boys            92     8      [?]    [?]        82   11     7       93      7
  Girls           37    63       27     73        12   12    76       21     79


Sectional and Item Analysis of the It Scale 
for Children 


Table 2 shows an analysis by sections of 
the It scale. 


Toy Pictures Section. From 75% to 93% of the
toy item choices of boys in all age groups are for 
masculine objects. In contrast, only 30% to 42% of 
the toy item choices of girls, kindergarten through 
the fourth grade, are for feminine choices. In the 
case of fifth-grade girls, however, 63% of their 
choices are for feminine objects. 

Eight Paired Pictures Section. When this section 
is taken as a whole, from 78% to 93% of all the 
choices of boys in all groups are for masculine al- 
ternatives. For example, on the Indian item, from 
84% to 98% of the boys indicate that It would 
rather be the male Indian than the female Indian. 
In the case of girls from kindergarten through the 
fourth grade, only 30% to 50% of the choices are


for feminine alternatives. And on the Indian item, 
for example, only 25% to 35% indicate that It would 
rather be the female Indian than the male Indian. 
Fifth-grade girls, however, show a very different 
preference pattern in that 73% of their choices are 
for feminine alternatives, and in connection with the 
Indian item, 76% express a preference for It want- 
ing to be the female Indian rather than the male 
Indian. 

Four Child-Figures Section. When given an oppor-
tunity to express a preference for the “kind” of child
It would rather be, from 71% to 90% of the boys
in all age groups indicate that It would rather be a
boy. Only 4% to 16% indicate that It would rather
be a girl and from 2% to 18% indicate It would
rather be a girlish boy or a boyish girl. On the other
hand, only 23% to 33% of the girls from kinder-
garten through the fourth grade indicate that It
would rather be a girl, while 37% to 72% indicate
a preference for It being a boy, and from 2% to
30% indicate It would rather be a girlish boy or a
boyish girl. These percentages are in sharp contrast
to fifth-grade girls, 76% of whom express a pref-
erence for It being a girl, only 12% for It being a
boy, and 12% for It being a boyish girl.

Parental Role Section. When asked whether It
would rather be a mother or a daddy when It grows 
up, 77% to 95% of the boys in all age groups indi- 
cate a preference for It being a daddy, while only 
5% to 23% indicate a preference for It being a 
mother. In the case of girls from kindergarten 
through the fourth grade only 23% to 48% express 
a preference for It becoming a mother, while 52% 
to 77% express a preference for It becoming a daddy. 
A very different parental role preference pattern is 
evident in fifth-grade girls, however, in that 79% in- 
dicate It would rather be a mother and only 21% 
that It would rather be a daddy. 


Theoretical Implications 


The present finding that girls as a group do 
not show nearly the same degree of preference 
for the feminine role that boys show for the 
masculine role is consistent with the results of 
a number of studies of adults in which men 
and women were asked: “Have you sometimes 
wished you were of the opposite sex?” or “If
you could be born over again, would you
rather be a man or a woman?” or “Have you
ever wished that you belonged to the opposite
sex?” Three such investigations may be cited
in this connection: Terman’s study (10) of
792 married couples, the Fortune Survey (3)
of 1946, and the Gallup Poll (4) of 1955.
These studies reveal that only between 2½%
and 4% of adult males compared to between
20% and 31% of adult females in our culture
state that they have been aware of the desire
to be of the opposite sex. This sex difference
in adults, showing five to twelve times as
many women as men having been conscious
of the desire to be of the opposite sex, is
paralleled by results of the present study. Re-
sponses to the Parental Role Section of the It
scale may be taken as an example
(Table 2). At the kindergarten level, more
than twice as many girls as boys express a
preference for the parental role of the oppo-
site sex (52% compared to 23%), and from
the first through the fifth grade, between three
and twelve times as many girls as boys express
a preference for the role of the parent of the
opposite sex.

[...] in relation to the
present findings are [...] homo-
sexuality. Although based [...] on
clinical studies of [...], convincing
evidence has been established that points to
a functional relationship between [sex-role] in-
version and certain forms of homosexuality
(5, 6, 7, 8, 9). In fact, adult, passive, femi- 
nine male homosexuals and active, masculine 
female homosexuals almost invariably have 
childhoods in which sex-role inversion is a 
prominent feature and in which the usual 
parent-child relationship is such that for one 
reason or another the child is unable to form 
a close identification linkage to the parent of 
the same sex together with having formed an 
excessively strong attachment to the parent of 
the opposite sex. This points to the assump- 
tion that the inverted personality comes from 
a family constellation in which the bond be- 
tween mother and son or father and daughter 
was abnormally strong, while that between 
father and son or mother and daughter was 
ineffective, weak or nonexistent. In such cases 
the mother is the identification ideal of the 
son, while the father serves as such an ideal 
in the case of the daughter. 

As far as the present study is concerned the 
occurrence of opposite sex-role preference is 
much more common in girls than in boys. 
This does not mean, however, that all girls 
who show a predominant preference for the 
masculine role are necessarily developing in- 
verted personalities. The essential basis of 
inversion (i.e., the process in which an in- 
dividual adopts the psychological identity 
typical of the opposite sex) appears to be 
an early, continuing, emotionally deep-rooted 
identification with, as well as preference for, 
the sex-role of the opposite sex. Expressed 
preference per se for the role of the opposite 
sex may or may not be based on identifica- 
tion with that role (2). Thus, for example, 
if a girl’s basic and underlying identification 
is with the feminine role, the fact that she 
may show a preference for the masculine role 
during childhood does not necessarily indi- 
cate sex-role inversion. It is quite likely that 
many girls prefer much that is associated with 
the masculine role without having formed a 
fundamental identification with that role. 
Furthermore, in our culture, girls are allowed 
and often encouraged to participate in tasks 
and activities that are typical of boys. Girls 
may wear shirts and trousers for example 
even though such clothing is typically identi- 
fied with the male. The converse is not true 
in the case of boys. Thus, severe social cen-
sure would result if boys wore dresses or in
other ways impersonated girls. There are 
many other areas in which girls, in contrast 
to boys, are permitted to take part in activi- 
ties characteristic of the opposite sex. Never- 
theless, if a girl has made a basic and primary 
identification with and continues to prefer the 
masculine role through childhood and into 
adolescence (and conversely with the boy) 
sex-role inversion in adulthood would be ex- 
pected, one aspect of which would be a homo- 
sexual object choice. This whole problem has 
been discussed in greater detail in another 
paper (12). 


Summary


A masculinity-femininity test, the It Scale
for Children, was administered to 303 boys
and 310 girls between the ages of approxi-
mately 5½ and 11½. These Ss were enrolled in
classes from kindergarten through the fifth 
grade in the Pleasanton, California, Elemen- 
tary School. The It scale is made up of pic- 
tures of various objects and figures typical of 
and associated with the role of one sex in 
contrast to the role of the other sex. A child-
figure drawing, referred to as “It,” is used by
having each S make choices for It. The find- 
ings are as follows: 

1. In each age group the mean score of 
boys is significantly more masculine than the 
mean score of girls and, conversely, the mean 
score of girls is significantly more feminine 
than that of boys. This would be expected 
since the It scale is composed of masculine 
and feminine alternatives, culturally defined.

2. Girls in all age groups are significantly 
more variable than boys in their sex-role pref- 
erence. 

3. Boys show a much stronger preference
for the masculine role than girls show for the
feminine role, particularly from kindergarten
through the fourth grade.

4. Girls at the kindergarten level show a 
preference pattern characterized by relatively 
equal preference for masculine and feminine 
elements. 

5. Girls from the first grade through the 
fourth grade show a stronger preference for 
the masculine role than for the feminine role. 

6. In contrast to girls in all earlier grade 
levels, girls in the fifth grade show a pre- 
dominant preference for the feminine role. 

The implications of these findings are re-
lated to adult studies of opposite sex-role pref-
erence, to inversion, and to homosexuality.


Some Correlates of Affective Tone of Early Memories


June Elizabeth Chance 


University of North Carolina 


The present investigation is an attempt to 
test the idea that affective tone of the earliest 
memory which an individual can report will 
reflect a general trend in his personality or- 
ganization. More specifically, it is hypothe- 
sized that the pleasantness or unpleasantness 
of this memory is predictably related to the 
tendency to recall a proportionately greater 
number of successes or failures in an experi- 
mental task, and that the character of the 
first memory (a form of self-report) is re-
lated to responses given to a self-report per-
sonality instrument.


Procedure 


The subjects of this study were 78 under- 
graduate students enrolled in a psychology 
course. For a majority, it was their first course 
in psychology. The experiment was conducted 
during a regular class meeting and took about 
40 minutes. The Ss were first given a sheet of
paper on which appeared 32 anagrams. The
scrambled words varied in length from 4 to [...]
letters; all were very common English words.


The Ss were told: 


We are trying to construct a brief test of intelli-
gence which will be especially appropriate for col-
lege students like yourselves. These are [...]
which we believe might be [...],
but we need to know more about [...]
on them carefully and do the best [...]
of letters so that they [...]. Do not use for-
[...] set of letters will
[...] word if properly [...].
You will have fifteen minutes in which [...]
ahead.


At the end of 15 minutes the papers were
collected. While the Ss had been working, the
experimenter transcribed the list of anagrams
to the blackboard. Now the correct solution
for each anagram was written next to it.
Solutions were left on the board for a few
minutes and then erased.

Blank sheets of paper were then passed out.
The Ss were asked to think carefully for a 
few minutes, and then to write as complete a 
description as possible of their first childhood 
memory. After they finished their free de- 
scriptions, they were requested to answer the 
following questions which were written on the 
blackboard. (a) Is this memory primarily a 
pleasant or unpleasant one for you? (b) Ap- 
proximately how old do you think you were 
at the time this event occurred? (c) Were you 
talking well, i.e., able to make most of your 
needs known verbally, at the time of this 
event? (d) Is there some possibility that what
you have described could be a dream or some- 
thing which someone else told you about 
sometime after it happened rather than a 
memory? 

When the questions had been answered, 
papers were collected and another blank sheet 
was passed out. The Ss were then asked to 
write as many words from the list of ana- 
grams (solutions) as they could recall. These 
papers were collected at the end of 10 min- 
utes. The group was thanked for their co- 
operation and dismissed. 

A few weeks earlier an abbreviated group 
form of the MMPI had been administered to 
this group by another experimenter.1 This
form included all of the items of the Welsh A 
and R scales. Welsh (2), on the basis of his 
work in developing the scales, has suggested 
that the A scale measures general maladjust- 
ment with anxiety and dysphoria as the most 
prominent features, while R measures a tend- 
ency toward denial and repression. 


1 The author wishes to thank Mr. Carl Cochrane
for making these data available.


Hypotheses


The hypotheses to be tested in this study 
were the following: (a) Ss who report a 
pleasantly-toned early memory tend to recall 
relatively more successes than failures from 
the anagrams task as contrasted with Ss who 
report an unpleasantly-toned memory. (b) Ss 
who report a pleasantly-toned early memory 
tend to have R scores higher than A scores, 
while Ss who report an unpleasantly-toned 
memory tend to have A scores higher than R 
scores. 


Treatment of Data and Results 


Since different Ss successfully completed 
different numbers of anagrams, an analysis 
based on the absolute numbers of successes 
and failures recalled would be misleading. 
Consequently, a score for the number of suc- 
cesses was computed by dividing the number 
of successes recalled by the number of suc- 
cesses obtained on the original task. A simi- 
lar operation was performed to obtain a 
score for recall of failures. These scores are 
designated PRS and PRF, respectively. 
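
As an illustration, the two proportions can be computed as below. The function and argument names are hypothetical, the example figures are invented rather than taken from the study, and a "failure" is assumed to mean any of the 32 anagrams that the S did not solve in the time allowed. The sketch also returns PRS minus PRF, the difference used later to test the first hypothesis.

```python
def recall_scores(solved, successes_recalled, failures_recalled,
                  total_anagrams=32):
    """Return (PRS, PRF, PRS - PRF) for one subject.

    PRS = successes recalled / successes obtained on the original task.
    PRF = failures recalled / failures on the original task, where a
    failure is assumed to be any anagram left unsolved.
    """
    failed = total_anagrams - solved
    if solved <= 0 or failed <= 0:
        raise ValueError("both proportions need a nonzero denominator")
    if not 0 <= successes_recalled <= solved:
        raise ValueError("successes_recalled is out of range")
    if not 0 <= failures_recalled <= failed:
        raise ValueError("failures_recalled is out of range")
    prs = successes_recalled / solved
    prf = failures_recalled / failed
    return prs, prf, prs - prf


# Invented example: 16 anagrams solved, 8 of them recalled,
# and 4 of the 16 unsolved anagrams recalled.
prs, prf, difference = recall_scores(16, 8, 4)
```

A positive difference means that the S recalled proportionately more successes than failures.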

Since PRS and PRF are proportions, the 
scores of an S who had either solved most of 
the anagrams or very few of them might be 

relatively unreliable. For this reason, data 
from Ss who had solved fewer than 6 or more 
than 26 anagrams were eliminated from the 
study. Also data of Ss who expressed doubt as 
to the validity of the memory they gave were 
eliminated. Thirteen Ss of the original group 
were not included in the analysis for these 
two reasons. The data of the remaining 78 
are presented here. 

Pleasantly-toned memories were reported 
by 38 Ss; unpleasantly-toned memories, by 
40. These proportions correspond fairly well 
to those previously reported in the literature 
(1). Ages reported by Ss at which they be- 
lieved the recalled event to have occurred 
varied from 1 to 6 years, with a median of 3 
years. Only 4 Ss believed that they were not 
talking to some degree at the time the re- 
called event took place. Memories ranged 
from complex social interactions such as be- 
ing ostracized or approved by other children, 
through painful accidents and being fright- 
ened, to simple but vivid sensory impressions 
like colors and foods. The S’s categorization
of the memory as pleasant or unpleasant was
accepted in all cases, regardless of content. 

In order to test the first hypothesis, PRF 
was subtracted from PRS for each S. (Only 6 
of 78 Ss recalled more failures than successes;
4 of these 6 were in the unpleasant memory 
group.) For the resulting values a test of the 
mean difference between the pleasant and un- 
pleasant memory groups was computed. The 
difference was in the predicted direction; the 
value of t obtained was 2.35 (76 df, .05 >
p > .02).

To test the second hypothesis, three tests 
were made. First, the mean difference between 
the pleasant and unpleasant memory groups 
in A scores (converted to standard scores on 
the basis of a table presented by Welsh) was 
computed. The value of t obtained was 1.69
(76 df, .10 > p > .05). While the unpleasant
memory group had A scores slightly higher 
than those of the pleasant memory group, the 
result is not statistically significant. 

Next, the mean difference between the pleas- 
ant and unpleasant memory groups in R 
scores (standard scores) was computed. The 
value of t equalled 2.08 (76 df, .05 > p >
.02). This difference was significant and in
the direction predicted, i.e., the pleasant
memory group had R scores higher than those 
of the unpleasant memory group. 

Third, a chi-square test of the relationship
between individuals’ A and R scores was com-
puted (Table 1). Cases were tabulated within
each memory group depending upon whether
the A score exceeded the R score or vice
versa. The value obtained was 8.62 (1 df, .01
> p > .001) which is highly significant. Thus
individuals in the pleasant memory group
were more likely to have R scores higher than
their A scores, while those in the unpleasant
memory group were more likely to have A
scores higher than their R scores.


Table 1

Chi-Square Test of the Relationship Between Affective
Tone of Early Memory and Discrepancy
Between A and R Scores

Group                       A>R    R>A    Total

Pleasant memory group        14     24      38
Unpleasant memory group      28     12      40
Total                        42     36      78

χ² = 8.62; df = 1; .01 > p > .001.
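
The reported chi square can be reproduced from the four cell frequencies of Table 1 (14, 24, 28, and 12). The sketch below uses the ordinary shortcut formula for a fourfold table without a continuity correction. That choice is an assumption, since the paper does not state how the value was computed, but it yields the published figure. The function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Chi square for the fourfold table [[a, b], [c, d]], uncorrected.

    Shortcut formula: N * (ad - bc)^2 divided by the product of the
    two row totals and the two column totals.
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: pleasant and unpleasant memory groups; columns: A>R and R>A.
chi_square = chi_square_2x2(14, 24, 28, 12)
print(round(chi_square, 2))
```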

Tests of the mean difference in numbers of 
anagrams originally solved and total number 
of solutions recalled by the two memory 
groups were not significant. 


Discussion 


The study should be considered as a prob- 
lem in selective retention rather than repres- 
sion in the psychoanalytic sense. The con- 
firmation of the first of the hypotheses sug- 
gests that the affective character of what the 
individual reports as his first memory does 
bear a consistent relationship to his tendency 
to recall other things which have some affec- 
tive loading attached. The findings regarding 
the second hypothesis indicate some consist- 
ency in self-report, particularly with respect 
to denial or tendency to ignore that which 
might be discomforting to the person. In a 
peripheral way, the test of the second hy- 
pothesis might be considered to be a valida-
tion of the R scale.


Summary


Seventy-eight college students were asked to 
solve a list of anagrams presented as part of
an intelligence test in the process of construc-
tion. They were allowed sufficient time to
complete only a part of the list. They were 
then given the solutions to all of the ana- 
grams. Next, they were asked to describe the 
earliest childhood memory they could recall 
and to state whether this memory was pleas- 
ant or unpleasant. At the end of the experi- 
mental session they were asked to recall as 
many words from the original anagrams list 
as they could. A hypothesis regarding affective 
tone of the memory and relative tendency to 
recall successes or failures was confirmed. A 
second hypothesis of the relationship between 
character of the memory and scores on the 
Welsh A and R scales (MMPI) was also con- 
firmed.


Psychological Control Patterns Within Families¹,²


Joseph Luft


The purpose of this study was to determine
whether children varied as their parents did 
in psychological control. The construct, psy- 
chological control, refers to the manner in 
which the individual regulates inner and outer 
tensions or stress. Running on a continuum 
from constriction and inhibition at one end to 
impulsiveness and expressiveness at the other, 
psychological control as a central personality 
variable tells us how the individual manages 
his feelings, how he relates to others and to 
himself, his attitudes toward authority and 
toward ambiguity. 

After preliminary exploration of experimen- 
tal and clinical methods for measuring psy- 
chological control, an 82-item questionnaire 
was developed. The items cover a variety of 
everyday situations and deal with personal 
and interpersonal attitudes judged to be be- 
havioral manifestations of psychological con- 
trol. The independent judgment of four clini- 
cal psychologists, when in unanimous agree- 
ment, was the basis for the original selection 
of the items. 


1 An extended report of this study may be ob-
tained without charge from Joseph Luft, San Fran-
cisco State College, San Francisco 27, California, or 
for a fee from the American Documentation Insti- 
tute. Order Document No. 5105, remitting $2.00 for 
microfilm or $3.75 for photocopies. 

2 This investigation was supported (in part) by a
research grant (M-528) from the National Institute
of Mental Health of the National Institutes of
Health, Public Health Service. The study was car-
ried out at Stanford University. Jeanne Block, Bar-
clay Martin and Eva Shippee collaborated in this
research.


Supporting the validity of the instrument, 
significant correlations were found between 
scores on the psychological control question- 
naire and on independent measures such as 
the Berkeley F scale, expansiveness on the 
draw-a-person test, favorable attitude toward 
self as measured on an adjective checklist and 
teachers’ ratings of cooperativeness. Reliabil- 
ity measures varied from .72 to .91. 

The questionnaires were administered to 79 
boys (mean age 14.5) and to their parents, 
and to 25 girls (mean age 15.0) and to their 
parents. 

Results suggested that there was a signifi- 
cant relationship between parents and sons in 
psychological control, somewhat higher be-
tween sons and mothers than between sons
and fathers. Fathers in general had higher
scores than mothers. When parents differed
sharply from each other on psychological con-
trol scores, the sons’ scores were not signifi-
cantly different from mean scores for boys in
general.

Results for girls and their parents were 
quite erratic, and in general were not pre-
dicted. A high negative correlation (-.60),
significant at the .001 level, was noted be-
tween fathers and daughters, while mothers
and daughters were not significantly related
on psychological control scores.

These results point to a need for more di- 
rect and accurate measurement of ego func- 
tioning within the family setting. 


Brief Report. 
Received January 7, 1957.


A Typology of Tests, Projective and Otherwise


Donald T. Campbell 


Northwestern University 


To judge from the pages of this and similar 
journals, the projective test movement is here 
to stay. But the rubric projective, once useful 
in mobilizing a reaction against older diag- 
nostic procedures, has been stretched to in- 
clude such a heterogeneous variety of meas- 
ures that its denotational value has become 
attenuated. In a paper on indirect attitude 
measurement (5), the present writer proposed 
a typology of test formats which has turned 
out to be one of the most cited features of 
that paper. Since that typology is now deemed 
to be inadequate, and since the denotational 
problem still remains, this second effort is of-
fered. From an initial focus on personality
tests come three dichotomies, which generate
eight test types. Five of these types contain
tests commonly regarded as “projective,” and
only two are unrelated to personality meas-
urement.


The Three Dichotomies


1. Voluntary vs. objective. In the voluntary
test the respondent¹ is given to understand
that any answer is acceptable, and that there
is no external criterion of correctness against
which his answers will be evaluated. He is en-
couraged in idiosyncrasy and self description.


The test assignment may state “this is not a test
of your ability,” or “there are no right or wrong an-
swers” or “answer in terms of how you really feel.”
In contrast, in an objective test the respondent is told,
either explicitly or implicitly, that there is a correct
answer external to himself, for which he should
search in selecting his answer. The concepts of “ac-
curacy” and “error” are in the subject’s mind. Phe-
nomenologically he is describing the external, objec-
tive world, although in so doing he is inevitably
reflecting his idiosyncratic view of that world, and
can be unselfconsciously “projecting” in an impor-
tant meaning of that word, as will be illustrated
more fully in the discussion of test types 5 and 6
below.²


1 The term respondent is borrowed from social psy-
chology to refer to the person from whom data are
collected, and whose personality is being examined.
The term subject, or S, as used in experi-
mental psychology reports is felt to be entirely
inappropriate. More clinically oriented terms, such as
patient or client, are too specialized to refer to the
full range of respondents employed in personality re-
search.


2. Indirect vs. direct. In the direct test, the 
respondent’s understanding of the purpose of 
the test and the psychologist’s understanding 
are in agreement. Were the respondent to read 
the psychologist’s report of the test results, 
none of the topics introduced would surprise 
him. This is obviously so in an achievement 
test given at the end of a course. It is equally 
true for the typical public opinion poll. It is 
probably so for the usual diagnostic interview.
It is so for many interest tests and adjust- 
ment inventories. 


2 The distinction here made partially overlaps the
discussion by Rosenzweig (19) on levels of response 
to projective tests. His category subjective clearly 
belongs with the voluntary class as here defined, 
emphasizing the subject’s self-conscious focus on 
describing himself. In his projective category, the 
respondent looks away from himself at some “ego- 
neutral” object, as in the phenomenologically objec- 
tive orientation of this paper. His category called 
objective is not the same as the present usage, but 
refers to the psychologist’s orientation, and includes 
a behavior-sampling approach not relevant to the 
present discussion. The distinction is also related to 
Cattell’s discussion of varieties of projective tests 
(8). When he classifies certain approaches as a va- 
riety of objective tests employing misperception, his 
usage is in agreement with that of the present writer.
This agreement is limited, however, and on many 
points the analyses differ, as when Cattell places 
the TAT and the Tautophone in the same subtype. 
In the present analysis, the TAT is voluntary, and 
the Tautophone the most classic example of the 
objective assignment among projective personality 
measures.

In the indirect test, the psychologist interprets the
responses in terms of dimensions and categories dif- 
ferent from those held in mind by the respondent 
while answering. If a person tells stories to pictures 
under the belief that his thematic creativity is being 
measured, and the psychologist then interprets the 
products as depth projections, the test is indirect. If 
a person expresses his likes and dislikes about a series 
of drawings, and is as a result classified as an oral- 
pessimist, the test is indirect. If a respondent believes 
he is participating in a general survey of the public’s 
opinion on a variety of harmless issues, and gets 
scored, however invalidly, as a paranoid proto-fascist, 
the test is indirect. In general, whenever responses are 
taken as symptoms, rather than as literal information, 
the test is indirect. 

Characteristic of the indirect test is a façade. By
this is meant a false assignment to the respondent
which distracts him from recognizing the test’s true
purpose and which provides him with a plausible
reason for cooperating. Initially the TAT had such
a façade: “This is a test of your creative imagina-
tion.” The objective test façade is used in an impor-
tant class of indirect tests of social attitudes, in which
the respondent tries to show his knowledge of cur-
rent events and is scored for the bias he shows in the
directionality of his errors (5, 14). The expression of
aesthetic taste, participation in public opinion sur-
veys, judgments of moral right and wrong, and judg-
ments of logical consistency all have been used as
façades in indirect tests of personality, interests, or
social attitudes.

The potential ethical problems arising from the ap- 
plication of personality and attitude tests in adminis-
trative situations (e.g., 21) would seem to center
around this one dimension of indirection. 


3. Free-response vs. structured. This di- 
chotomy is already well established in the 
classification of personality and attitude as- 
sessment procedures. Typically, the projective 
tests have been open-ended, free, unstruc- 
tured, and have had the virtue of allowing 
the respondent to project his own organiza- 
tion upon the material. 


The free-response format has the advantage of not 
suggesting answers or alternatives to the respondent, 
of not limiting the range of alternatives available, 
nor of artificially expanding it through the sugges- 
tions provided in the prepared alternatives. In the 
multiple-choice Rorschach the respondent can see 
images pointed out to him by the prepared alterna- 
tives which he would never have noticed on his own. 
The structured format was typical of the personality 
and attitude measurement devices of the first flower- 
ing of such tests in the period from 1920 to 1935, 
and hence provides the tradition against which both 
the projective test movement and modern survey re- 
search techniques were revolting (e.g., 11). But even 
this dimension is not uniquely associated with pro-
jective tests, as the two earliest papers in English
using the concept of projection in a testing setting
(15, 7) used structured response formats.


The Eight Test Types 


1. Voluntary, indirect, free-response. These
are the classic projective techniques, including 
free association, the Rorschach, the Thematic 
Apperception Test, doll play, drawing, and 
such projective questions as, “What do you 
admire most in people?” or “What is the 
most embarrassing thing you can think of?” 
It is in this category that most inventiveness 
has been shown, and the present paper makes 
no effort to cite even a fraction of the ap-
propriate studies.³

2. Voluntary, indirect, structured. In this 
category would be found the multiple-choice 
Rorschach and multiple-choice association 
tests. In addition, indirect questionnaires 
would fall in this cell, such as the F scale 
measure of authoritarian personality trends 
(1). Where Osgood’s semantic differential
(16) is used to measure indirectly attitudes 
toward parents, and other important figures, 
it belongs in this category, as would a Q-sort 
approach (10) to unconscious identification, 
for example. Humor tests and annoyance in- 
ventories used for indirect diagnostic purposes 
also belong here. The Barron-Welsh art pref- 
erence test (3) is another good example of 
this category. The Blacky test (4) contains 
both free-response and structured features, 
and in part belongs here. 

3. Voluntary, direct, free-response. This 
category is epitomized by sentence-comple- 
tion tests, essay-type questionnaires, the auto- 
biographical assignment frequently given in 
personality research, and the open-ended in-
terview in public opinion surveys. Of these
the sentence completion tests, at least, are 
commonly regarded as projective, but are clas- 
sified here in the belief that rarely is the re- 
spondent unaware that he has been revealing 
his own attitudes.

4. Voluntary, direct, structured. This cate-
gory would include the classic quantitative
efforts to measure adjustment, personality,
interests, and attitudes, including the Wood-
worth inventory, the Thurstone and Likert
attitude tests, the Strong and Kuder interest
inventories, the Bernreuter, the MMPI, the
many biographical inventories, and many
others. When these are scored with an em-
pirical key, or are presented in a forced-
choice format, the respondent may be in the
dark about the psychologist’s interpretation
of a particular response. But even in these
instances, the topics and dimensions of inter-
pretation used by the psychologist are still
congruent with the purposes of the test as
understood by the respondent. Since in gen-
eral most test constructors tend to introduce
some efforts at disguise, were too strict an in-
terpretation placed upon this dimension, the
category of direct tests would be very small
indeed.


3 No effort has been made to provide bibliographi-
cal references for the well-known projective tests.
Such references are available in a number of sources
(e.g., 2). Where, in the effort to supply adequate
illustrations of each category, a less well-known test
is cited, representative bibliographical references are
provided.

5. Objective, indirect, free-response. These
are projective tests using the objective test
façade, focusing the respondent’s attention on
the external world but allowing an unstruc-
tured response situation. The oldest and most
used of the projective tests in this category is
the “Verbal Summator” or “Tautophone” (13,
19). A recording of indistinct vowel sounds
is presented with some such instructions as
“This is a recording of a man talking. He is
not speaking very plainly, but if you listen
carefully you will be able to tell what he is
saying. I’ll play it over and over again so
that you can get it, but be sure to tell me as
soon as you have an idea of what he is say-
ing.” Subjects almost unanimously accept this
façade and produce intelligible verbal con-
tent which they are totally [...]
from themselves. It should be noted that there
are also several auditory apperception and au-
ditory association methods which should not
be confused with the Tautophone and which
belong clearly in category 1.


Sherriffs’ Intuition Questionnaire is another
example. The respondent is asked to explain the
motives underlying the behavior indicated in
excerpts from life histories taken from a ran-
dom sample of the population. This tech-
nique has seen very successful application
(12) in the measurement of achievement and affilia-
tion motives. Day’s (9) task of asking respondents
to explain a sample of day dreams is similar.

Rechtschaffen and Mednick’s Autokinetic Word 
Technique (17) belongs here. In the autokinetic 
illusion, a single dot of light in an otherwise totally 
dark room appears to move. They presented a ran- 
dom series of exposure periods and told the respond- 
ents that words were being written by the point of 
light, which they were to report. All of their re- 
spondents “saw” words, and when told about the 
nature of the experiment were shocked to learn that 
they had themselves fabricated the content. 

The assignment to judge the character of persons 
presented in photographs can be presented as an ob- 
jective task. Common statements in psychology texts 
as to the impossibility of making valid judgments of 
this kind may create a problem, although with a 
properly prepared set of materials (e.g., 6) the phe-
nomenological validity of the task is great enough
so that even college students will accept it as a 
legitimate objective task. 

6. Objective, indirect, structured. For many 
of the test formats of type 5, structured forms 
could be prepared. The trait judgments from 
the photographs assignment has been used in 
both free-response and structured forms (e.g.,
6), and indeed used a structured response in 
its first application by Murray (15). The 
much used error-choice approach to attitude 
measurement (14) is another typical example 
of this category. 

7. Objective, direct, free-response; and 

8. Objective, direct, structured. The three 
dichotomies which have generated the above 
six types of personality test, produce these 
two remaining categories. These turn out to 
be the typical tests of ability or achievement, 
in free-response and in structured form, the 
latter category being characteristic of almost 
all standardized tests in these areas. 
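
The way three two-valued dichotomies yield exactly eight test types can be made concrete with a short enumeration. The sketch below is illustrative only: the names `DICHOTOMIES` and `enumerate_test_types` are hypothetical, and the ordering of the poles is chosen so that the generated sequence follows the numbering used in the text.

```python
from itertools import product

# Each dichotomy is listed with the pole that appears first in the
# article's numbering of the eight types.
DICHOTOMIES = (
    ("Voluntary", "Objective"),
    ("Indirect", "Direct"),
    ("Free-response", "Structured"),
)


def enumerate_test_types():
    """Return the 2 x 2 x 2 = 8 combinations as labelled strings."""
    return [", ".join(combo) for combo in product(*DICHOTOMIES)]


types = enumerate_test_types()
for number, label in enumerate(types, start=1):
    print(number, label)

# Types 7 and 8 (objective and direct) are the ability and achievement
# tests; the remaining six are the personality test types.
personality_types = [t for t in types if not t.startswith("Objective, Direct")]
```

Filtering out the two objective, direct combinations leaves the six categories that the text treats as personality measures.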


Summary 


From a consideration of differences among 
personality measurement approaches, three 
dichotomies or dimensions of distinction have 
been drawn. The joint application of these 
generates eight test types. Of these, six are 
appropriate to the field of personality meas- 
urement. These are: 


1. Voluntary, Indirect, Free-Response.
2. Voluntary, Indirect, Structured.
3. Voluntary, Direct, Free-Response.
4. Voluntary, Direct, Structured.
5. Objective, Indirect, Free-Response.
6. Objective, Indirect, Structured.

Examples are provided for all six types. While
category 1 contains the most typical projec- 
tive tests, tests called projective are found in 
all except category 4. Categories 5 and 6 are 
the least developed but should be given par- 
ticular attention, as they can involve the un- 
selfconscious projection of personality content 
upon the phenomenologically objective envi- 
ronment. 


Received September 20, 1956.


The Rorschach test was introduced 35 years
ago, but although a considerable amount of 
research has been published on it, the situa- 
tion is still such that its reliability is undeter- 
mined and its validity suspect. The major sup- 
port for the test comes from its extensive use 
in the clinic and from studies which indicate 
that, used globally, it has some value. What it 
is in the test that works, to what extent its 
value is a function of the test per se and of 
the observations and interview material as- 
sociated with it, and why particular scores, in 
isolation or in patterns, measure personality 
in the way they do, if they do, still remains 
to be determined. 

It is sometimes argued that the Rorschach
test has been insufficiently experimentally
validated because there are no adequate cri-
teria of personality against which to validate
it. However, a first step is indicated. Person-
ality, whatever else it is, involves the concept
of stable individual differences through time.
If everyone were the same as everyone else
there would be no need for the word, and if
everyone varied haphazardly from day to day,
a table of random numbers would be as good
a measure as any. Consequently, the most
fundamental requirement of a test of person-
ality is that it demonstrate relatively stable
individual differences in whatever it purports
to measure. If it cannot do so, there is no
point in attempting to validate it. In a sense,
this is merely stating that validity cannot be
established without test-retest reliability, but
in light of Rorschach development, this view
requires emphasis.


1 This report is [...] data obtained by [...]
thesis submitted [...] University [...]
out the additional analysis under
the direction of the senior author.

In the present study an attempt is made to 
determine whether responses to inkblots pro- 
vide an adequate measure of stable individual 
differences. In order to obtain several sam- 
ples of responses without repeating the same 
test, it was necessary to construct new sets of 
inkblots. Consequently, this study cannot be 
considered a Rorschach study except to the 
extent that the blots in the Rorschach have 
something in common with inkblots in general. 


Method 
Subjects 


In that the research plan required repeated 
testing of the same subjects, it was necessary 
for practical reasons to keep the sample small. 
Accordingly, 16 volunteer sophomores, equally 
divided among males and females, who were 
enrolled in an introductory course in psychol- 
ogy at the University of Massachusetts, were 
used as subjects (Ss). 


Materials and Apparatus 


The testing materials consisted of 100 cards 
6 by 5 inches with symmetrical inkblot de- 
signs on them. The cards were randomly as- 
signed to ten sets of ten cards each. 


Each card consisted of a large achromatic blot 
above or below a large chromatic blot, a small chro- 
matic blot on both sides of the large achromatic 
blot, and a small achromatic blot on both sides of 
the large chromatic blot. The large achromatic blot 
was internally differentiated by superimposing sev-
eral layers of black ink; the large chromatic blot by
the use of different colors. The small blots were of
single colors. In addition to black, each card con- 
tained the colors red, green, brown, and blue, which 
were produced by the application of standard inks. 
An example of a test card is shown in Fig. 1. Within 
sets, the cards were alternately arranged so that half 
the time the large chromatic blot appeared on top 
and half the time on bottom. The cards were de- 
signed in the manner described as the arrangement 
was esthetically pleasing, offered an opportunity for 
several responses per card, provided a relatively equal 
opportunity for responses to colored and noncolored 
areas, and made it possible to produce parallel sets 
composed of any number of cards without undue 
concern about the characteristics of the individual 
cards. 

Location charts, to enable the Ss to indicate to 
what areas they were responding, were made by 
tracing outlines of the blots on mimeograph stencils. 
Standard response sheets provided a place for the 
entry of each response. An opaque projector was
used to present the blots. 


Procedure 


The Ss were seated within a small desig- 
nated area in order to keep the projected 
stimulus relatively constant. The room was 
dimly illuminated on the sides to provide 
sufficient light for writing. Preceding the first 
session, the following explanation was given: 
“This investigation is concerned with the de- 
termination of what different people see in 
inkblots. It will be necessary to use a great 
many inkblots, and so ten sessions will be re- 
quired. There will be two sessions each week
for the next five weeks. At every session a
series of inkblots will be presented and each
of you will be requested to record what you
see in the blots.” Within the five-week pe-
riod the sessions were approximately equally
spaced. The group was instructed to write
three responses for each card, which was pre-
sented for three minutes, and to identify the
responses on the location sheet. (An attempt
to obtain an inquiry by following the Group
Rorschach procedure [4] was abandoned after
pretesting indicated that many Ss failed to
comprehend what was expected of them.)


Fig. 1. Example of one of the inkblot cards.


Scoring 


For the most part the protocols were scored 
by the Klopfer (5) procedure. Shading was 
not scored as the incidence of responses in 
which it clearly entered was too low for 
evaluation, possibly as a result of the ab- 
sence of an inquiry. In scoring color responses, 
color was presumed to have influenced the 
percept if it was directly described as colored 
(e.g., “a blue butterfly”) or if both the con- 
tent and the area strongly indicated color 
(e.g., “blood” to a red area). Klopfer scores 
that were utilized without revision were: M, 
FM, m, F, FC, CF, C, ΣC, H, Hd, A, Ad, S.
Following is a list of the remainder of the 
scores: 


W: Responses to the entire large chromatic or 
achromatic blots. 
w: Responses to the entire small chromatic or 
achromatic blots. 
D: Responses to major differentiated areas of the 
large blots. 
dd: Responses to minute or undifferentiated areas. 
CA: Responses to chromatic areas. 
AA: Responses to achromatic areas.


Form-level: The sum of weights assigned to re-
sponses according to the degree of differentiation,
integration, and elaboration involved. (Initially, an
attempt had been made to consider form accuracy,
but it soon became apparent that without an inquiry
such judgments were too subjective.) A basal score
was assigned by weighting a rejection as 0 (30 re-
sponses were required), vague form as 1.0 (e.g.,
“cloud,” “explosion”), simple form as 1.5 (e.g.,
“drop of water,” “leg”), and complex form as 2.2
(e.g., “human,” “animal”). An additional weight of
.5 was added for each of the following: movement,
appropriate use of color, integration of two or more
separate areas, or any other appropriate elaboration.
A weight of .5 was subtracted for any inappropriate
elaboration. The maximum weight allowed for any
single response was 3.5.

CF + C: The arithmetical sum of CF and C re-
sponses. These scores were combined because the fre-
quency of either, alone, was too low for satisfactory
evaluation.

M + FM + m: This sum was investigated in order
to determine whether the combined activity responses
were a more satisfactory measure of individual dif-
ferences than the separate components. There is as
much intrinsic logic to a combined activity score as
there is to a combined color score.

M:(ΣC + M): This ratio was computed in place
of the Rorschach M:ΣC ratio as it is statistically
more stable.

(M + FM + m):(ΣC + M + FM + m): This ratio
was investigated in order to determine whether a
combined activity score in relation to a combined
color score was a more effective measure of indi-
vidual differences than the preceding ratio.

Anx: The Elizur (3) content scale of anxiety.
Hst: The Elizur content scale of hostility.


Statistical Analysis 

Each of the scores was treated by a double 
classification analysis of variance which per- 
mitted an evaluation of differences between in- 
dividuals and between testing sessions. Those 
scores whose distributions were markedly 
skewed were transformed by adding .5 to each 
score and extracting the square root of the 
sum. Reliability coefficients were computed by 
using the ratio of the variance between Ss 
minus the error variance, to the variance be-
tween Ss plus (k - 1) error variance, where
k is the number of sessions (6). These coeffi-
cients reflect the degree to which a score
measures variance due to individual differ-
ences relative to error variance. The signifi-
cance of the reliability coefficients was deter-
mined by the F ratio of the variance between
Ss to the error variance.


Results 


Individual Differences 


Separate analyses of variance were per-
formed for the data on all ten sessions and
for the data on the first and second sessions.
Table 1 presents the reliability coefficients
derived from these analyses. [...]
individual differences indicates that responses
to inkblots tell something about individuals, 
but does not indicate the degree to which 
they do. This can be determined by the mag-
nitude of the reliability coefficients in Table 1,
which are seen to vary from .20 to .56 when 
based on all the data. The median coefficient 
is .40. The score combining M + FM + m 
gives a higher reliability coefficient than any 
of its components and also provides a higher 
coefficient when related to the weighted color 
score in place of M, suggesting that this score
deserves further investigation. None of the 
measures offers sufficient reliability for indi- 
vidual prediction. The question might be 
raised as to whether increasing the length of 
the test would make the reliability satisfac- 
tory. By the Spearman-Brown formula it is 
found that if length is the only limitation, 
utilizing 30 cards in place of 10 would raise 
a reliability coefficient of .50 to .75, which is, 
at best, a minimally acceptable figure for in- 
dividual prediction.


Table 1

Reliability Coefficients Based on All Ten Sessions
and on Sessions I and II Only

Score                          I-X          I & II
W                              .38          .16
w                              .33          .04
D                              [illegible]  .41
dd                             .41          .06
CA                             .26          .14
AA                             .27          .16
S                              .56          .45
M                              .42          .49
FM                             [illegible]  [illegible]
m                              .23          .82
M + FM + m                     .53          .61
FC                             .20          .23
CF + C                         .33          .60
ΣC                             .48          .72
M:(ΣC + M)                     .46          [illegible]
(M + FM + m):(ΣC + M + FM + m) .52          .80
F                              .41          .59
Form-level                     .41          .61
H                              .43          .52
A                              .50          .48
Anx                            .26          .59
Hst                            .30          .41
Mdn.                           .40          .48


Moreover, the time in-
volved in obtaining and scoring 90 responses
would make such a test prohibitive for most
purposes.
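
The Spearman-Brown projection cited in this paragraph is easy to verify. The sketch below applies the standard formula for a test lengthened by a given factor; the function name is illustrative, and the calculation assumes, as the text does, that test length is the only limitation on reliability.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a test is lengthened by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Going from 10 cards to 30 cards triples the length of the test.
projected = spearman_brown(0.50, 30 / 10)
print(f"projected reliability = {projected:.2f}")
```

Tripling the length of a test with a reliability of .50 gives the .75 quoted in the text.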
In order to hold the influence of repeated 
testing to a minimum, the findings on only 
the first and second tests were considered. In 
Table 1 it can be seen that the magnitude of 
the reliability coefficients for sessions I and II 
ranges from .04 to .82. The median coefficient 
is .48. Fourteen of the 22 coefficients are
higher for sessions I and II than for sessions 
I through X, which hardly suggests a real 
difference, although what difference there is 
is in favor of the former. The few coefficients 
which are above .70 can best be accounted 
for by the decrease in stability of the coeffi- 
cients as a result of utilizing only 20% of the 
data. In accordance with this consideration, 
only about half the coefficients are significant. 


Test and Session Differences 


In order to obtain further information on 
whether repeated testing results in set and 
practice effects, reliability coefficients were 
computed on the data from sessions I and X
and from sessions IX and X. In the former
case, significant reliability coefficients at the
.05 level were obtained for only two scores,
m and C. In the case of sessions IX and
X, ten coefficients were significant at the .05 
level as compared to twelve which were found 
for sessions I and II. Apparently, individuals
performed in a consistent manner to the last 
few as well as the first few tests, but were no 
longer doing the same thing at the end as they 
were at the beginning. This may mean that 
the scores investigated are significantly stable 
only over very short periods of time, or, more 
likely, that the experience of taking some ink- 
blot tests alters the reaction to further tests. 

Differences between sessions can be asso- 
ciated either with differences in the tests or in 
sequential position. Significant test-session dif- 
ferences were found for all seven location 
scores other than the space response. For no 
other score was significance approached, the 
majority of F values falling short of one. In 
order to determine whether the significance 
of the location scores was associated with or- 
derly sequential changes, the mean number 
of responses for each of the significant scores 
was plotted as a function of the number of 
the session. No tendency was found for any 
of the scores to consistently rise or fall. It 
thus appears that the differences associated 
with the testing sessions were probably a 
function of differences in the inkblot sets. It 
is interesting to consider, if this is the case, 
that the composition of the blots determined 
which areas were responded to but did not 
influence the determinants or content. The 
finding that there was no tendency for any of 
the scores to increase or decrease does not, of 
course, contradict the conclusion that indi- 
viduals developed sets, but indicates that the 
sets were not uniform. 


Discussion 


The finding that responses to the inkblots 
were primarily a function of the individuals 
responding to them rather than of the charac- 
teristics of the blots supports the basic theory 
underlying the use of inkblots as a projective 
technique. Every score investigated, whether 
a standard Rorschach score or one developed 
for this study, measured individual differences 
to a significant degree. However, the degree 
of reliability found was not encouraging. 
Moreover, there was reason to believe that 
adding more cards or requiring more responses 
would not have raised the reliability of any of 
the measures to an acceptable degree unless 
the test was made so lengthy as to be im- 
practical. Possibly this indicates that the ink- 
blot approach to measuring personality can 
never be sufficiently objectified and must re- 
main in the nature of an art. However, such 
a conclusion would be premature for several 
reasons. 
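
The remark about lengthening the test can be illustrated with the Spearman-Brown formula. The sketch below is only an illustration: the source does not say which formula the author relied on, and the target reliability of .90 is an assumed value. The only figure taken from the text is the median coefficient of .48 for sessions I and II.

```python
def spearman_brown(r, k):
    """Projected reliability when a test with reliability r is lengthened k times."""
    return k * r / (1 + (k - 1) * r)


def lengthening_needed(r, target):
    """Factor by which the test must be lengthened to reach the target reliability."""
    return target * (1 - r) / (r * (1 - target))


median_r = 0.48      # median coefficient reported for sessions I and II
assumed_target = 0.90  # hypothetical standard for individual prediction

print(round(spearman_brown(median_r, 2), 3))                 # doubling the test
print(round(lengthening_needed(median_r, assumed_target), 2))  # factor needed
```

Under these assumptions a test of median reliability would have to be made nearly ten times as long, which is in line with the statement that an acceptable test would be impractically lengthy.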

For one, different scores and combinations 
of scores might have yielded different results. 
It may be possible to combine several scores 
relatively low in reliability into composite in- 
dices of higher reliability. In this connection 
it was found that the weighted sum of the 
color scores had a higher reliability than any 
of the individual color scores, and the same 
held for a combination movement score. How- 
ever, the M:(ΣC + M) ratio was not more 
reliable than its components. The Rorschach 
is a test in which patterns are considered more 
important than individual scores. This is a 
reasonable claim, but if it is to be more than 
an escape behind the skirts of a mistaken 
notion of Gestalt psychology, it is necessary 
to define the patterns. Once this is done their 
reliability can be assessed. 

Secondly, the low reliabilities may have 
been a function of the relatively homogeneous 
nature of the sample. The degree of sensitivity 
required of a measure is obviously a function 
of the magnitude of the deviations it must 
discriminate. Responses to inkblots could be 
of sufficient reliability to diagnose extremes 
of behavior without being adequate for dis- 
criminating among “normals.” In this respect, 
it is necessary to investigate reliability coeffi- 
cients in different types of populations. 

Thirdly, the fact that the present set of 
inkblots failed to yield adequate reliability 
coefficients does not mean that other inkblot 
tests would not give better results. Unfortu- 
nately, there is no satisfactory reliability 
study of the Rorschach. The split-half tech- 
nique used by a few investigators is not ap- 
propriate (1). In a test-retest study by Eich- 
ler (2) on the Rorschach and Behn-Rorschach 
blots, the total number of responses was not 
controlled, and may be presumed to have in- 
flated the coefficients. Nevertheless, he concluded that the 
reliability was inadequate for individual pre- 
diction. There is a need for experimental 
exploration to determine how inkblot tests 
should be constructed and administered in 
order to maximize their efficiency as measures 
of individual differences. It may be that the 
Rorschach Test will not be the final develop- 
ment in the use of inkblots as a measure of 

personality. 

A final possibility is that the personality 
variables presumably measured by the re- 
sponses are themselves not highly stable. How- 
ever, if it can be demonstrated that increas- 
ing the length of the test, or otherwise modi- 
fying it, results in an increase in reliability, 
it would have to be conceded that it is the 
instrument and not the nature of personality 
which is determining the present results. In 
this connection it would be desirable to in- 
vestigate reliability as a function of time be- 
tween tests to determine to what ex- [...] 
construct new inkblot sets, as in the present 
study. 


Summary 


Personality, whatever else it is, involves 
the concept of stable individual differences 
through time. The most fundamental require- 
ment of a test of personality, therefore, is that 
it demonstrate that it can measure such dif- 
ferences. In order to evaluate responses to ink- 
blots in this respect, 100 specially constructed 
inkblots were randomly assigned to ten sets 
of ten each. This was followed by intensive 
group testing of 16 individuals ten times over 
a five-week period. In order to hold total 
number of responses constant, three responses 
per card were required. Several Rorschach 
scores plus some new scores were investigated. 
It was found that all scores measured indi- 
vidual differences significantly beyond chance, 
which was interpreted as supporting the basic 
theory underlying the use of inkblots as a 
projective technique. However, reliability co- 
efficients were below acceptable standards for 
individual prediction. This was true when the 
coefficients were derived only from the first 
two tests as well as from all ten. There is no 
evidence that Rorschach test-retest reliability 
coefficients would be more favorable if total 
number of responses and memory were con- 
trolled. The implications of the present find- 
ings and the need for further investigation 
of reliability of responses to inkblots were dis- 
cussed. 


Received October 2, 1956. 


References 


1. Cronbach, L. J. Statistical methods applied to 
Rorschach scores: a review. Psychol. Bull., 
1949, 46, 393-429. 

2. Eichler, R. M. A comparison of the Rorschach 
and Behn-Rorschach Inkblot Tests. J. consult. 
Psychol., 1951, 15, 185-189. 

3. Elizur, A. Content analysis of the Rorschach with 
regard to anxiety and hostility. J. proj. Tech., 
1949, 13, 247-284. 

4. Harrower, M. R., & Steiner, M. E. Large scale 
Rorschach techniques. Springfield, Ill.: Charles 
C Thomas, 1945. 

5. Klopfer, B., & Kelley, D. M. The Rorschach tech- 
nique. Yonkers, N. Y.: World Book Co., 1946. 

6. Lindquist, E. F. Design and analysis of experi- 
ments in psychology and education. Boston: 
Houghton Mifflin, 1953. 


Journal of Consulting Psychology 
Vol. 21, No. 3, 1957 


A Factorial Study of Behavioral and Psychological 
Measures of Anxiety 


Harold Wilensky 


Franklin D. Roosevelt VA Hospital, Montrose, New York 


Anxiety questionnaires tend to show low to 
moderate correlations with ratings of observ- 
able behavior, and moderate to high correla- 
tions with other subjective measures or ques- 
tionnaires. The concept of anxiety, however, 
is so broad, including diverse psychological 
and physiological aspects, that the prediction 
of striking interrelationships seems unwar- 
ranted. 

The present study undertook the analysis 
of the interrelationships among ten variables 
associated with the concept of anxiety. The 
data were collected during an experimental 
evaluation of a tranquilizing drug. Sixty-six 
hospitalized male patients were observed by 
ward personnel during a ten-day premedica- 
tion period. Since a schizophrenic population 
was employed, three measures of severity of 
illness (degree of contact, consistency of re- 
sponse, and type of ward) were included. 
From a psychiatric interview, ratings of the 
S’s report of physical symptoms, observable 
signs of anxiety, and degree of contact were 
obtained. Ward personnel rated patients for 
the extent of disturbed behavior. A count of 
sleep disturbances, blood pressure, and pulse 
readings were made. An anxiety questionnaire 
yielded an anxiety score and the consistency 
of response score. Patients were also asked 
the direct question: “Do you consider your- 
self to be a tense or anxious person?” 


1 Based on a paper read at the meeting of the 
American Psychological Association, Chicago, 1956, 

2 An extended report of this study may be ob- 
tained without charge from Harold Wilensky, VA 
Hospital, Montrose, N. Y., or for a fee from the 
American Documentation Institute. Order Document 
No. 5181, remitting $1.25 for microfilm or $1.25 for 
photocopies. 


Tetrachoric correlations were computed 
among the variables and the resulting matrix 
was factored by Thurstone’s complete cen- 
troid method. Two factors were extracted. 
The axes were rotated to oblique simple 
structure. The correlation between factors is 
-.34. 
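
As an illustration of what a tetrachoric correlation estimates from a fourfold table, the sketch below uses the well-known cosine-pi approximation. This is an assumption made for illustration: the source does not state how the tetrachoric coefficients were computed, and the cell counts in the example are hypothetical, not data from the study.

```python
import math


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    a and d are the concordant cells (+/+ and -/-) of a 2 x 2 table,
    b and c are the discordant cells (+/- and -/+).
    """
    if a * d == 0 and b * c == 0:
        raise ValueError("correlation is undefined for this table")
    if b * c == 0:
        return 1.0   # no discordant pairs in one off-diagonal cell
    if a * d == 0:
        return -1.0  # no concordant pairs in one diagonal cell
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1 + math.sqrt(odds_ratio)))


# Hypothetical split of 66 patients on two dichotomized variables.
print(round(tetrachoric_cos_pi(25, 8, 10, 23), 2))
```

The approximation is most trustworthy when both variables are split near the median; more exact estimates require numerical integration of the bivariate normal distribution.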

Factor A loads highly on subjective reports of 
anxiety—the anxiety scale, the admission of 
anxiety in response to the direct question, and 
the psychiatrist’s ratings of the subject’s re- 
port of psychological and physical symptoms. 
It tentatively may be identified as experienced 
anxiety. Patients who recognize and readily 
admit to feelings of anxiety do so on paper 
and pencil tasks and in interview situations. 
A moderate loading of the ward behavior 
variable and the low but significant loading 
of the sleep disturbance count suggest that 
the experienced anxiety is in part expressed 
in daily ward living. 

Factor B appears to be defined by the con- 
tact with reality variables and also higher 
blood pressure. The moderate loadings of the 
blood pressure readings on this factor are in 
accord with previous studies. Schizophrenics 
in remission tend to have higher blood pres- 
sure than the more disturbed patients. The 
psychiatric rating of observable signs of anx- 
iety loads negatively on Factor B. Many 
schizophrenic patients in poorer contact mani- 
fest behavior such as grimaces and manner- 
isms which tend to be rated as tension, al- 
though they do not admit verbally to feelings 
of anxiety. 


Brief Report. 
Received February 25, 1957. 
Differences in Perceptual and Cognitive Behavior 
as a Function of Experience Type 


The proposition that differences in experi- 
ence type on the Rorschach reflect certain 
trends in personality functioning has been a 
common assumption in clinical practice. In a 
previous paper (3), it was pointed out that 
one way of conceptualizing the experience 
type is in terms of the person’s use of in- 
ternal and external stimulus factors. The in- 
troversive person utilizes internal stimulus 
factors to a greater degree in that he invokes 
aspects of the stimulus situation more proxi- 
mal to himself, while the extratensive person 
utilizes external stimulus factors in that he 
responds more to the physical, “out-there” 
qualities of the blot. The present research is 
designed to establish empirically differences 
in performance on a perceptual and cognitive 
task as a function of the experience type of 
the person. The perceptual problem given to 
our subjects (Ss) was the Gottschaldt Em- 
bedded Figures Test (EFT), as modified by 
Witkin (7). This test requires S to perceive 
a simple geometric figure embedded in a 
larger, more complex figure. In their research 
on field dependence and independence, Wit- 
kin and his associates have studied certain 
empirical relationships between the EFT and 
the Rorschach (8). They found that Ss with 
longer solution times on the EFT (field de- 
pendent) had lower “coping” scores on the 
Rorschach, as defined for example by more CF 
and C responses than FC responses. 
Perception of the simple figure (field inde- 
pendence), on the other hand, was associated with 
giving more human movement responses and 
predominantly FC responses. 

In that portion of the current research con- 


1 This study was facilitated by a 
cerning the relationship of the Rorschach ex- 
perience type and the EFT, we assumed that 
introversive Ss on the Rorschach are express- 
ing more conceptual, abstract behavior. In a 
sense, they are not as dependent as extraten- 
sive Ss on the available external stimulus fac- 
tors, but utilize a more personalized and ab- 
stract problem-solving attitude. Therefore, on 
the EFT, we would expect this conceptual ap- 
proach of the introversives to be of benefit in 
finding the simple figure, inasmuch as this 
type of approach has been considered to fa- 
cilitate EFT solutions (7). The extratensives, 
by analogous reasoning, we would expect to 
be more dependent on the external stimulus 
factors. To the extent the extratensive Ss de- 
pend upon the stimulus field of the Gottschaldt 
card, and do not invoke the more conceptual 
approach of the introversive Ss, we would ex- 
pect the former Ss to do more poorly on the 
EFT. 

Prediction 1. Ss with an introversive experi- 
ence type have shorter solution times on the 
EFT than do Ss with an extratensive experi- 
ence type. 

The cognitive task given to our Ss was a 
modification of the Role Construct Repertory 
Test developed by Kelly (6). In our use of 
this test, we were primarily interested in a 
measure of cognitive complexity. As reported 
elsewhere (4), cognitive complexity is defined 
as the ability to develop alternative interper- 
sonal perceptions from among a group of per- 
sons known to S. The S is asked to make cer- 
tain sortings (described below) using people 
as the objects to be sorted. These sortings are 
assumed to represent available perceptions S 
has of others. The greater the number of 
responses elicited from S on the sorts, 
the more cognitively complex is he. The hu- 
man movement response presumably repre- 
sents an ability to go beyond what is im- 
mediately available in the stimulus. We should 
expect, therefore, that cognitive complexity, 
as a measure of response differentiation, 
should relate positively to the tendency to 
have an introversive experience type. This 
tendency should be further reinforced since 
both the cognitive complexity and human 
movement measures involve people as con- 
tent. 

Prediction 2. Ss with an introversive ex- 
perience type have higher cognitive complex- 
ity scores than do Ss with an extratensive ex- 
perience type. 

Method 


Experimental Groups 

Due to the empirical nature of the research, 
it was decided to employ the method of cross 
validation in the experimental procedure. Ac- 
cordingly, after the initial group of Ss was 
run (hereafter referred to as Group 1), a 
second group of Ss (Group 2) was given the 
identical experimental battery. The Group 1 
Ss consisted of 39 female undergraduates from 
Radcliffe College. Group 2 Ss were 23 female 
summer school students who were primarily 
college undergraduates from schools in the 


eastern part of the country. All Ss were paid 
to serve in the research. 


Inkblot Test 


In place of the standard Rorschach test 
procedure, a special modification for obtain- 
ing inkblot reactions was used. This modifica- 
tion consists of using 16 Rorschach D’s which 
were selected on the basis of their judged 
ability to provide both internal and external 
stimulus factors. Such a selection facilitates 
obtaining an adequate number of human 
movement and color responses for measure- 
ment purposes. The number of blots used was 
increased to 16 from the previously used 10 
(3) in order to provide more variable stimu- 
lus material and to reduce the number of 
repetitions. A template placed over the entire 

card exposes only the desired blot portion. 
The S is presented with each blot following 
the usual Rorschach instructions. Only one 
response per blot is elicited. After S has gone 
through the 16 blots, he is told he will see 
them again and is to tell the examiner what 
else it might be. This second trial is followed 
by the usual inquiry procedure. In this way, 
a total of 32 responses is obtained from each 
S. The following are Beck’s (2) notations of 
the specific blot areas which were used, in 
order of administration: 


Card I: entire center “figure” area (D 4). 

Card II: upper left red (D 2). 

Card IV: lower center area (D 1). 

Card III: upper left red inverted (D 2). 

Card V: entire right half without “antenna” (D 4). 

Card VIII: bottom center pink and orange area 
inverted (D 2). 

Card VI: entire left half of card without top pro- 
jection, rotated 90 degrees to right (D 4). 

Card IX: left green figure, card rotated 90 degrees 
to right (D 4). 

Card VII: entire right half of card (D 9). 

Card X: upper left blue area (D 1). 

Card III: left human figure (D 1). 

Card IX: top right orange area (D 3). 

Card IV: lower projection of “heel” and “toe” of 
boot, rotated 90 degrees to the right (D 2). 

Card X: inner blue area (D 6). 

Card V: entire middle, including D 2 and D 3 
(D 7). 

Card III: middle red (D 3). 


Embedded Figures Test 


Because of time limitations in the experi- 
mental procedure, only eight of the 24 em- 
bedded figures used by Witkin were used in 
this study. While some unreliability is intro- 
duced because of the fewer figures, care was 
taken to select those drawings which would 
be representative of the various difficulty lev- 
els reported by Witkin (7). The following are 
Witkin’s notations for the figures used, in or- 
der of administration: F-1; D-2; A-5; G-2; 
E-5; B-1; C-1; and A-2. The EFT was ad- 
ministered with the same instructions used by 
Witkin (7). Each S’s score on the EFT was 
the sum of the times taken to find the simple 
figures in all eight complex figures. Ss were 
given two-minute time limits for the first 
seven figures, and a four-minute limit on the 
final figure. If S had not located the simple 
figure when the time limit had elapsed, she 
was given a score of either 120 or 240 seconds. 
The time allowances were ample for the ma- 
jority of Ss to find the embedded figures. 
Group 1 Ss had a mean solution time of 
315.78 seconds, with a range of 80 to 8[...] 
seconds, while Group 2 Ss had a mean solu- 
tion time of 357.04 seconds, with a range of 
99 to 837 seconds. 
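
The EFT scoring rule just described (summed solution times, with an unsolved figure scored at its time limit) can be sketched as follows. The function name and the use of `None` for an unsolved figure are illustrative choices, not part of the original procedure.

```python
# Time limits in seconds: two minutes for the first seven figures,
# four minutes for the final figure, as stated in the text.
TIME_LIMITS = [120] * 7 + [240]


def eft_score(times):
    """Sum the solution times for the eight figures.

    `times` holds one entry per figure in order of administration:
    seconds to solution, or None if the figure was not found.
    An unsolved figure is scored at its time limit.
    """
    if len(times) != len(TIME_LIMITS):
        raise ValueError("expected one time per figure (eight in all)")
    total = 0.0
    for seconds, limit in zip(times, TIME_LIMITS):
        if seconds is None or seconds > limit:
            total += limit
        else:
            total += seconds
    return total


# Hypothetical S who solves seven figures and fails the last one.
print(eft_score([12, 30, 45, 8, 60, 25, 90, None]))
```

On this rule the highest possible score is 7 x 120 + 240 = 1080 seconds, so longer totals indicate poorer performance.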


Cognitive Complexity 


On this sorting task, S is asked to write the 
names of seven persons known personally to 
him who come closest to matching seven role 
descriptions. The following role descriptions 
were used for the seven persons: 


1. Yourself. 

2. Your mother. 

3. Your father. 

4. Your sister closest to you in age, or the person 
most like a sister. 

5. Your current boy-friend, or a boy you know 
well. 

6. Your best friend (of your sex). 

7. Your ex-boy-friend, or a boy you know well. 


This list was designed to sample both 
family members and age-mates of the same 
and opposite sex. The actual sorting consisted 
of placing three of the names at a time in 
front of S and asking her to consider how two 
of the three were alike in some important per- 
sonal characteristic, and different from the 
third in this respect. The S was then asked to 
name the similarity and its opposite, even 
though the opposite did not apply directly to 
the third person. For example, the first sort 
given Ss contained the self, mother and father. 
One S sorted herself and her father together 
as being “selfish,” the opposite of which she 
stated to be “generous.” Thus the mother is 
perceived as more generous than the father 
or S. Each S is given 25 sorts, prearranged so 
that no three persons appear together more 
than once. The score for cognitive complexity 
is obtained by counting the number of dif- 
ferent verbal dimensions given by S on the 
25 sorts. A repetition is counted if either or 
both ends of a dimension are exactly re- 
peated on a subsequent sort. Thus, S can re- 
ceive a complexity score ranging from 1 to 25. 
Actually, the scores obtained for Group 1 
ranged from 6 to 25, with a mean of [...], 
while scores for Group 2 ranged from 1[...] to 
25, with a mean of 19.48. 
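
The complexity score described above counts the sorts that introduce a new verbal dimension. A minimal sketch of one reading of that rule follows; treating labels as matching after trimming spaces and ignoring capitalization is an assumption, since the text says only that the ends must be "exactly" repeated.

```python
def complexity_score(sorts):
    """Count the different verbal dimensions given over a series of sorts.

    `sorts` is a sequence of (similarity, opposite) label pairs, one per
    sort. A sort counts as a repetition when either end of its dimension
    has already been used on an earlier sort.
    """
    seen = set()
    score = 0
    for similarity, opposite in sorts:
        # Assumed normalization: ignore case and surrounding spaces.
        ends = {similarity.strip().lower(), opposite.strip().lower()}
        if not (ends & seen):
            score += 1
        seen |= ends
    return score


# Hypothetical responses: the third sort repeats "generous".
example = [("selfish", "generous"), ("warm", "cold"), ("generous", "stingy")]
print(complexity_score(example))
```

With 25 sorts the score therefore runs from 1 (every sort after the first repeats an earlier label) to 25 (no label is ever repeated).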
Results and Discussion 


The inkblot responses for each S were 
scored without knowledge of S’s EFT or cog- 
nitive complexity scores. A satisfactory inter- 
scorer reliability for the inkblot reactions had 
been obtained, as reported in a previous study 
(3). To obtain the experience type of each S, 
the total number of human movement (M) re- 
sponses and the Sum C score were calculated. 
The latter score was derived by weighting the 
various color responses (FC, CF, and C) in 
the usual manner. It was decided to place 
Ss into one of two experience types, either 
introversive (M > Sum C) or extratensive 
(Sum C > M). This was done to enable the 
use of a larger number of Ss in each group 
than would be possible if other experience- 
type groupings were also used (such as co- 
arctated and ambiequal). In order to place 
each S in either the introversive or extraten- 
sive group, the original M and Sum C scores 
were converted to standard scores. Such a 
procedure, as Barron (1) and others have 
pointed out, eliminates the otherwise tenuous 
assumption that the blots have equal stimulus 
value in evoking movement and color re- 
sponses. Separate standard score distributions 
were calculated for Group 1 and Group 2. If 
S had a larger standard score for M than for 
Sum C, she was placed in the introversive 
category; if her Sum C standard score was 
larger than her standard score for M, she was 
placed in the extratensive category. One S in 
Group 1 had identical standard scores for M 
and Sum C, and was therefore excluded from 
the analyses for that group. In this way, for 
Group 1, 18 Ss were placed in the M > Sum 
C category, and 20 Ss fell into the Sum C > 
M category. Similarly, in Group 2, 11 Ss were 
in the M > Sum C category, and 12 Ss were 
in the Sum C > M category. 
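
The classification procedure can be sketched as below. Two points are assumptions rather than statements from the text: the weights 0.5, 1.0, and 1.5 for FC, CF, and C are the conventional Rorschach weights, which the text refers to only as "the usual manner," and the function names are illustrative. The scores in the example are hypothetical.

```python
from statistics import mean, pstdev


def sum_c(fc, cf, c):
    """Weighted color sum, assuming the conventional weights 0.5, 1.0, 1.5."""
    return 0.5 * fc + 1.0 * cf + 1.5 * c


def standard_scores(values):
    """Convert raw scores to standard scores within one group of Ss."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("scores do not vary; standard scores are undefined")
    return [(v - m) / s for v in values]


def experience_types(m_scores, sum_c_scores):
    """Label each S by comparing her standard scores for M and Sum C."""
    labels = []
    for z_m, z_c in zip(standard_scores(m_scores), standard_scores(sum_c_scores)):
        if z_m > z_c:
            labels.append("introversive")
        elif z_c > z_m:
            labels.append("extratensive")
        else:
            labels.append("excluded")  # identical standard scores
    return labels


# Hypothetical raw scores for three Ss.
m_raw = [9, 4, 6]
sum_c_raw = [sum_c(2, 0, 0), sum_c(2, 3, 0), sum_c(1, 2, 0)]
print(experience_types(m_raw, sum_c_raw))
```

Because both sets of scores are divided by their own group standard deviation, the choice between the population and sample formula rescales both by the same factor and does not change any classification.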

Table 1 presents the mean number of re- 
sponses in the various determinant classes 
given by the Ss in the M > Sum C and Sum 
C > M categories of Groups 1 and 2. In ad- 
dition, for purposes of further evaluation, Ss 
from both groups were placed in a total dis- 
tribution, forming Group 1+ 2. This was 
done after it had been established that there 
were no significant over-all differences between 
Group 1 and Group 2 on the Rorschach, the 
EFT, and the cognitive complexity measures. 
New standard scores were derived to place Ss 
in the appropriate experience type for Group 1 
+ 2 in Table 1. Actually, all Ss remained in 
their original experience-type category except 
for the following changes: Two Ss who were 
originally extratensive in Group 1 became in- 
troversive in Group 1 + 2; two Ss who were 
originally introversive in Group 2 became ex- 
tratensive in Group 1 + 2; and the one S 
who had been ambiequal in Group 1 became 
extratensive in Group 1 + 2. 


Table 1 

Mean Rorschach Scores of the M > Sum C and Sum C > M Categories for the Experimental Groups 

               Group 1                 Group 2               Group 1 + 2 
         M > Sum C  Sum C > M    M > Sum C  Sum C > M    M > Sum C  Sum C > M 
Score    (N = 18)   (N = 20)     (N = 11)   (N = 12)     (N = 29)   (N = 33) 
M          9.3*       3.8          8.4*       3.8          9.3**      3.5 
FM         3.9        3.0          3.6        4.0          3.7        3.4 
m          1A         0.8          0.6        0.5          0.9        0.7 
F         15.1       16.1         16.4       14.3         15.1       16.6 
Sh         2.5        2.8          0.6        3.6*         1.9        2.9 
FC         1.3        2.4          1.4        2.6          1.5        2.3 
CF         0.4        2.4*         0.9        3.0*         0.7        2.5* 
Sum C      1.1        3.8*         1.6        4.5*         1.5        3.8* 

* Higher than mean score of other experience type category on this variable at or below .05 level of significance (two-tails). 
** Higher than mean value of this score of other experience type category in this group at .01 level of significance (two-tails). 

For purposes of presentation, the few pure 
color (C) responses given among all the Ss 
were included in Table 1 in the CF determi- 
nant listing. Similarly, all the various Klopfer 
shading scores (FK, KF, k, Fc, etc.) were 
included in the Sh category. It can be ob- 
served from Table 1 that the introversive and 
extratensive Ss in Groups 1, 2, and 1 + 2 
differ from each other only in terms of the M, 
CF, and Sum C scores. The sole exception is 
the significant difference in Group 2 between 
the experience types on the number of Sh re- 
sponses given. 

Table 2 presents the experimental findings 
for Groups 1, 2, and 1 + 2 relative to the two 
predictions presented earlier in the paper. 
White’s ranking method (5) was used to test 
the significance of the differences between 
means. Inspection of Table 2 indicates that 
the M > Sum C Ss in Group 1 had signifi- 
cantly longer solution times on the EFT than 
did the Sum C > M Ss, and that this differ- 
ence persisted in the cross validation involv- 
ing Group 2. When the Ss are combined into 
Group 1 + 2, we find that the difference be- 
tween the two experience types on EFT per- 
formance becomes even more significant (p 
< .01). These findings are directly opposite 
to those anticipated in Prediction 1 of the 
study. 


Table 2 

Mean Scores of the M > Sum C and Sum C > M Categories on EFT and Cognitive Complexity 
for the Experimental Groups 

                      EFT                  Cognitive complexity 
              M > Sum C   Sum C > M      M > Sum C   Sum C > M 
Group 1        363.70*     272.20          16.78       19.75* 
               (N = 16)    (N = 20)        (N = 18)    (N = 20) 
Group 2        430.82*     289.92          18.36       20.50* 
               (N = 11)    (N = 12)        (N = 11)    (N = 12) 
Group 1 + 2    384.81**    286.88          17.62       19.94* 
               (N = 27)    (N = 33)        (N = 29)    (N = 33) 

* Higher than mean score of other experience type category on this variable at or below .05 level of significance (two-tails). 
** Higher than mean score of other experience type category on this variable at or below .01 level of significance (two-tails). 

The findings relative to Prediction 2 are 
also to be found in Table 2. Here again, the 
results are in the opposite direction from the 
original prediction. In both Groups 1 and 2, 
Ss with the Sum C > M experience type had 
significantly higher cognitive complexity scores 
than did those Ss with the M > Sum C ex- 
perience type. Again, this difference exists 
when the Ss are combined into Group 1 + 2. 

It is apparent that the findings ran con- 
trary to our expectations in both experimen- 
tal groups, and that the findings on the EFT 
run counter in some degree to the previous 
work of Witkin (8). We wish to comment 
briefly upon one factor which appears to have 
contributed to our unexpected results, that is, 
our use of female Ss. It is likely that in mak- 
ing our original predictions we were not suffi- 
ciently aware of the possible importance of 
sex differences in performance on the kinds of 
perceptual and cognitive tasks we employed. 
Witkin has discussed this problem at some 
length (7, 8), and generally finds women less 
effective on the EFT and more variable in 
performance. It is interesting to note that the 
EFT performance of women is found to 
correlate less highly with Rorschach indices 
than in the case of men. In a previous study 
(4), one of the present writers found that in 
males, M production correlated positively 
with cognitive complexity scores. Research cur- 
rently in progress is designed to investigate 
these problems in more detail. 


Summary 


This study was designed to measure dif- 
ferences in performance on a perceptual and 
cognitive task as a function of experience 
type. It was predicted that introversive Ss 
would be able to perceive the simple figure in 
the embedded figures test more quickly than 
extratensive Ss. Further, introversive Ss were 
predicted to produce a higher score on a cog- 
nitive complexity test which involved an ob- 
ject-sorting task using people as the objects 
to be sorted. The inkblot method used was a 
modification of the regular Rorschach blots in 
which only Rorschach Ds were used. M and 
Sum C scores were converted into standard 
scores in order to place each S in either the 
introversive or extratensive experience-type 
category. Two groups of women college stu- 
dents were used as experimental Ss. The re- 
sults from both groups were in the direction 
opposite from that originally predicted. Ex- 
tratensive Ss perceived the embedded figures 
significantly faster than introversive Ss, and 
the former Ss had significantly higher cog- 
nitive complexity scores in their perception of 
people than did the latter group of Ss. The 
relationship of this study with previous re- 
search is discussed, and the importance of the 
role of sex differences in perceptual perform- 
ance is suggested. 


Received August 27, 1956. 
An Item Analysis of the Coloured Progressive 
Matrices (1947) 


Thomas E. Jordan and Carson M. Bennett 
Ball State Teachers College, Muncie, Indiana 


The Coloured Progressive Matrices (2) is 
a series of 36 designs, each of which is in- 
complete. The subject indicates which of a 
series of possible inserts he believes will com- 
plete the design. The scale is intended to 


Table 1 


Index of Discrimination and Level of 
Difficulty of Items 


        Level   Index               Level   Index 
        of      of                  of      of 
Item    diff.   discrim.    Item    diff.   discrim. 
A1      96%     .31         Ab7     24%     .49 
A2      91      .22         Ab8     17      .24 
A3      90      .20         Ab9     21      .24 
A4      88      .37         Ab10    23      .29 
A5      74      .03         Ab11    22      .28 
A6      76      .50         Ab12    10      .02 
A7      33      .37         B1      92      .16 
A8      49      .28         B2      71      .59 
A9      36      .52         B3      42      .70 
A10     29      .46         B4      50      .54 
A11     10      .06         B5      22      .46 
A12     24      .04         B6      28      .35 
Ab1     88      .31         B7      17      .15 
Ab2     85      AL          B8       4      .43 
Ab3     70      .45         B9      11     -.02 
Ab4     26      .43         B10     16      .13 
Ab5     28      .39         B11      9      .05 
Ab6     24      .53         B12      5      .02 


measure Spearman’s g factor and emphasizes 
the eduction of relationships and correlates. 
The significance of this interesting instru- 
ment is still in doubt since no item analysis 
has been made. As a result of a continuing 


interest on the part of the senior author in 
the Coloured Progressive Matrices as a test 
of intellectual ability for multihandicapped 
children, the authors have undertaken to cor- 
rect this deficiency. 

The Coloured Progressive Matrices was ad- 
ministered to two hundred children who were 
entering first grade and who had no known 
sensory-motor handicaps. The indices of dis- 
crimination were computed using the conven- 
tional upper and lower 27 per cent technique. 
The item difficulties were determined by an 
item count of correct responses. 

The data in Table 1 can be interpreted by 
the application of criteria concerning the qual- 
ity of test items. One such set of criteria has 
been supplied by Ebel (1, pp. 143-152), who 
suggested that an index of discrimination of 
.20 and above is satisfactory and that the 
range 40 to 70 per cent be used as a criterion 
for level of difficulty. When these criteria are 
applied, 25 of the 36 items are satisfactorily 
discriminative, but only 4 fall in the suggested 
difficulty range. Twenty-two items appear to 
be too difficult for the age group studied. The 
data suggest that the test is of less value for 
lower age groups. 
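
The two item statistics and the criteria attributed to Ebel can be sketched as follows. The data layout (one list of 0/1 item scores per child), the function names, and the example responses are assumptions for illustration; only the 27 per cent rule, the .20 discrimination criterion, and the 40 to 70 per cent difficulty range come from the text.

```python
def item_analysis(responses, fraction=0.27):
    """Return (difficulty, discrimination) for each item.

    `responses` holds one list of 0/1 item scores per child.
    Difficulty is the per cent of all children passing the item.
    Discrimination is the proportion passing in the upper 27 per cent
    (by total score) minus the proportion passing in the lower 27 per cent.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:k], ranked[-k:]
    results = []
    for i in range(len(responses[0])):
        difficulty = 100 * sum(r[i] for r in responses) / len(responses)
        p_upper = sum(r[i] for r in upper) / k
        p_lower = sum(r[i] for r in lower) / k
        results.append((difficulty, p_upper - p_lower))
    return results


def meets_criteria(difficulty, discrimination):
    """Apply the criteria cited in the text: index >= .20, difficulty 40-70%."""
    return discrimination >= 0.20, 40 <= difficulty <= 70


# Hypothetical data: ten children, two items.
data = [[1, 1]] * 5 + [[0, 1]] * 5
for stats in item_analysis(data):
    print(stats, meets_criteria(*stats))
```

An item passed by every child, like the second hypothetical item, has a discrimination index of zero because the upper and lower groups cannot differ on it.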


Received January 21, 1957. 
Diagnostic Prediction From Emphasis on the Eye 


and the Ear in Human Figure Drawings 


J Ronald I. Ribler* 
Michigan State University 


In 1949, Karen Machover (1) offered suggestions
on the significance of variability in human
figure drawings pertinent to personality. She
reported finding that certain aspects of the
personality of the individual were frequently
shown in features of his drawings and that
specific personality types would be more prone
to present given traits in their drawings which
would enable the clinician to make diagnostic
differentiation. Among these traits were
emphasis of eyes and ears, which, Machover
reported, would be found most frequently in
paranoid schizophrenics (1, pp. 48-50). The
purpose of this study was to test the following
hypothesis, which is implicit in Machover’s
monograph: paranoid schizophrenics, when asked
to draw a human figure, produce drawings with a
greater proportion of eye and/or ear emphasis
than other diagnostic groups.

Since Machover (1) did not define “emphasis,”
the following definition was used for the
purposes of this study. Eye emphasis is any one
or more of the following: (a) [illegible]
placement, (b) disproportion of eye in relation
to total figure, (c) differential line quality
(of eyes in terms of shading, etc., in contrast
to the rest of the figure), (d) excessive detail
of eye parts and/or eyebrows, (e) marked staring
quality. Ear emphasis is any one or more of the
following: (a) [illegible] placement, (b)
disproportion in relation to total figure, (c)
differential line quality (of ear in terms of
pressure, shading, etc., in contrast to the rest
of the figure), (d) excessive detail of ear
parts, (e) marked “listening” quality, (f) ear
adornments (earrings, etc.).

¹ This study was made possible by the
cooperation of the staff of the Veterans’
Administration Hospital, Battle Creek, Michigan,
and G. M. Gilbert of Michigan State University,
to whom gratitude is expressed.


Subjects and Procedure 


Subjects. The sample consisted of 120 
males, patients and attendants in a Veterans’ 
Administration neuropsychiatric hospital,
divided into four groups: Group I, 30 patients
diagnosed paranoid schizophrenia (mean age 30,
mean IQ 100.7); Group II, 41 patients diagnosed
unclassified schizophrenia (mean age 28, mean
IQ 100.4); Group III, 16 patients diagnosed
anxiety neurosis (mean age 35, mean IQ 107.4);
and Group IV, 33 hospital attendants (“normal”
controls, mean age 29, mean IQ 85.3).
Henmon-Nelson IQ was used
on the attendant group. Groups I, II, and III
were chosen as representing a typical sample 
of diagnostic cases seen in neuropsychiatric 
hospitals. In these three groups the Draw-A- 
Person Test was administered as part of the 
regular test battery under standard testing 
conditions (1, pp. 28-29). Group IV was 
tested in a classroom situation, also under 
standard testing procedures (1, p. 105). Two 
drawings were obtained from each subject in 
all of the groups. 

Judgment of drawings. After the drawings 
were collected, the identifying marks were ob- 
literated. The drawings were then numbered, 
randomized, and listed by number on the 
judges’ rating sheets. The age and race of 
the subjects were also listed on the judges’ 
sheets. The procedure for judging was standard,
the drawings being presented to four judges to
determine the presence or absence of the
variables of eye and/or ear emphasis.


Table 1

Reliability of Judges as Determined by Hoyt’s
Analysis of Variance Technique

Category          Reliability     df
Eye-Drawing 1        .65*         357
Ear-Drawing 1        .67*         357
Eye-Drawing 2        .13*         357
Ear-Drawing 2     [illegible]     357

Note. Mean reliability is .70.
* Significant beyond .01.


Three staff psychologists and one advanced 
graduate student in clinical psychology were 
asked to rate emphasis in the case of each 
drawing. Each judge was supplied with in- 
structions as follows: 

These drawings were gathered from four diagnostic 
groups; “normals” (attendants), anxiety reactions, 
paranoid schizophrenics, and unclassified schizophren- 
ics. The drawings are numbered and arranged in 
serial order from 1 to 120, with the diagnostic groups 
randomly represented. The enclosed sheet has spaces 
numbered from 1 to 120 corresponding to the draw- 
ings. In the appropriate spaces, indicate for each 
drawing in the pair the presence (+) or absence (0) 
of eye emphasis according to the criteria listed be- 
low. Also list the number(s) corresponding to the 
criteria which determined your decision in each case. 


The judges were not aware of the order of 
presentation nor of how many subjects repre- 
sented each diagnostic group. Each judge 
worked independently of the others. 

When the judging was completed, the judg- 
ing sheets were scored. Each indication of 
judged emphasis per judge was scored as one 
point. Since there were two drawings from 
each subject and a judgment of eye and ear 
emphasis on each by the four judges, the 
maximum number of points possible for a 
given subject was sixteen. 
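
The point count described above can be sketched as follows. The data layout (a dictionary keyed by judge, drawing, and feature) and the function names are assumptions made for illustration; the cutoffs in `classify` are the ones the study uses when it later dichotomizes the scores.

```python
JUDGES = 4
DRAWINGS = 2
FEATURES = ("eye", "ear")


def emphasis_points(judgments):
    """Count one point per judged emphasis.

    judgments maps (judge, drawing, feature) to True when that judge
    marked emphasis present; missing keys count as absent.
    """
    return sum(
        1
        for judge in range(JUDGES)
        for drawing in range(DRAWINGS)
        for feature in FEATURES
        if judgments.get((judge, drawing, feature), False)
    )


def classify(points):
    """Dichotomy used in the analysis: 0-4 none, 8-16 emphasis."""
    if points <= 4:
        return "no emphasis"
    if points >= 8:
        return "emphasis"
    return "mid-range (excluded)"


# Every judge marks emphasis on every feature of every drawing.
all_present = {(j, d, f): True
               for j in range(JUDGES)
               for d in range(DRAWINGS)
               for f in FEATURES}
max_points = emphasis_points(all_present)
```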

Reliability of judgments. The reliability
among the judges was determined by two
methods: percentage of agreement among the
judges (see Table 2), and Hoyt’s (2) analysis
of variance technique for determining
reliability (see Table 1).
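
For readers unfamiliar with the analysis of variance approach to reliability, a minimal sketch follows. It assumes the usual two-way layout (subjects by judges, one rating per cell) and the standard form of Hoyt's coefficient, (MS subjects − MS residual) / MS subjects; the function name and the toy ratings are hypothetical. With 120 subjects and 4 judges the residual degrees of freedom are (120 − 1)(4 − 1) = 357, which matches the df shown in Table 1.

```python
def hoyt_reliability(ratings):
    """Return (reliability, residual df) for a subjects-by-judges matrix.

    ratings: list of rows, one row per subject, one column per judge.
    Undefined (division by zero) if all subjects have the same mean.
    """
    n = len(ratings)       # subjects
    k = len(ratings[0])    # judges
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_resid = ss_total - ss_rows - ss_cols

    df_resid = (n - 1) * (k - 1)
    ms_rows = ss_rows / (n - 1)
    ms_resid = ss_resid / df_resid
    return (ms_rows - ms_resid) / ms_rows, df_resid


# Toy check: judges in perfect agreement give a coefficient of 1.
perfect, _ = hoyt_reliability([[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]])
# Degrees of freedom for a 120 x 4 layout.
_, df_120_by_4 = hoyt_reliability([[i % 2] * 4 for i in range(120)])
```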

Finally the sums (total number of judged
emphases) were tallied and a 2 × 2 chi-square
table set up to determine whether there were
significant differences between the groups.
The data were dichotomized (eliminating 22
subjects in the mid-range; N = 98), using 0-4
points as indicative of no emphasis and 8-16
points as indicative of emphasis.


Table 2

Frequency Table of Percentage of Agreement
Among the Judges

Percentage of
agreement (%)    Frequency (f)    (%) x (f)
     100              19            1900
      87              25            2175
      83              13            1079
      75              20            1500
      71               6             426
      67               6             402
      63               8             504
      58               7             406
      54               5             270
      50               4             200
      46               5             230
      42               2              84
   Total             120            9176

Note. Mean percentage of agreement is 76.46.
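
The mean percentage of agreement noted under Table 2 is a frequency-weighted mean, and the arithmetic can be checked directly. The sketch below takes the frequency in the 46 per cent row as 5, which is what the printed product of 230 implies; the variable names are illustrative.

```python
# (percentage of agreement, frequency) pairs from Table 2.
agreement_table = [
    (100, 19), (87, 25), (83, 13), (75, 20), (71, 6), (67, 6),
    (63, 8), (58, 7), (54, 5), (50, 4), (46, 5), (42, 2),
]

total_frequency = sum(f for _, f in agreement_table)
total_products = sum(pct * f for pct, f in agreement_table)
mean_agreement = total_products / total_frequency
```

The totals come to 120 and 9176, and the quotient is about 76.47, in line with the 76.46 given in the table note.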


Results and Conclusions 


An analysis of the data by means of chi 
square (see Table 3) indicates that there were 
no statistically significant differences between 
the diagnostic groups on the variables of eye 
and/or ear emphasis. 

Table 1 indicates that there is a rather high
degree of agreement among the judges and
adequate reliability between the judges. With
the reliability established, and with the
significance level of the chi-square analysis,
it appears that the hypothesis that paranoid
schizophrenics will produce greater eye and/or
ear emphasis on human figure drawings is not
supported by the data.


Table 3

Chi-Square Analysis: Judgments of Eye
and Ear Emphasis

                No emphasis    Emphasis
Group           (0-4 pts.)     (8-16 pts.)    Sum
Paranoids           21              7          28
Nonparanoids        56             14          70
Sums                77             21          98

Chi square = .296; df = 2; p = .90-.80.
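
The chi-square value in Table 3 can be reproduced from the four cell frequencies. The sketch below uses the ordinary shortcut formula for a fourfold table and assumes no continuity correction was applied, since that assumption recovers the printed value; the function name is hypothetical. It reproduces the statistic only. For reference, the conventional degrees of freedom for a 2 × 2 table are (2 − 1)(2 − 1) = 1, whereas the table reports df = 2.

```python
def chi_square_2x2(a, b, c, d):
    """Chi square for the fourfold table [[a, b], [c, d]],
    without continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Paranoids: 21 no emphasis, 7 emphasis; nonparanoids: 56 and 14.
chi2 = chi_square_2x2(21, 7, 56, 14)
```

The result is approximately .297, agreeing with the tabled .296 to within rounding.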




Summary 


One hundred and twenty pairs of Draw-A-Person
tests were selected in a Veterans’
Administration neuropsychiatric hospital to
determine whether judged eye and/or ear
emphasis could differentiate paranoid
schizophrenics from unclassified
schizophrenics, anxiety neurotics, and
“normals.” Scoring variables were selected and
the protocols were submitted to four judges
independently of each other. The reliability
among the judges was found to be adequate, as
was their agreement. The variables of emphasis,
however, proved to be statistically
nonsignificant with reference to diagnosis,
suggesting that further work is necessary
before these variables may be used with any
degree of confidence by the practicing
clinician.


Reliability and Validity of the n-Achievement Test


John D. Krumboltz
Air Force Personnel and Training Research Center¹

and William W. Farquhar
Michigan State University


The n Achievement (6) is a projective test 
designed to measure an individual’s need to 
achieve, his ambition, his drive, his motiva- 
tion, or in other words his desire to succeed 
in competition with some standard of excel- 
lence. In response to picture stimuli the sub- 
ject writes stories which are scored for mo- 
tivational content according to a well-defined 
set of instructions (6, pp. 107-138). 

Much evidence on concurrent and construct 
validity and on interscorer reliability has been 
published (2, 6, 7, 8, 11). However, less evi- 
dence in the way of predictive validity (9, 
10) and test-retest reliability has been re- 
ported. 


Procedure 


As part of an investigation on the effect of 
different teaching methods on outcomes in a 
How To Study course (3, 5), the n Achieve- 
ment was administered to students at the be- 
ginning and again at the end of this one quar- 
ter course at the University of Minnesota. 
Pictures A, B, D, and E (6, p. 375) of the 
n Achievement were administered both times 
in that order. One psychologist scored all 
stories. His scoring of 80 stories by 20 indi- 
viduals (6, pp. 346-374) correlated .81 with 
the scoring of these same stories by the
authors of the scoring system. Furthermore, on
another sample of 80 stories from the present
sample scored and rescored after an interval
of a month, a score-rescore reliability of .91
was obtained by this same psychologist.

¹ This research was accomplished while both
authors were at the University of Minnesota.
The opinions and conclusions expressed herein
are those of the authors. They are not to be
construed as necessarily reflecting the views
or indorsement of the Air Force or of the Air
Research and Development Command.

Three other tests which may be considered 
as tentative criterion instruments were also 
administered as pre- and posttests: the Sur- 
vey of Study Habits and Attitudes (SSHA), 
the Opinion Attitude and Interest Survey 
(OAIS), and a 99-item objective achievement 
examination (HTS) based on the content of 
the How To Study course. Both the SSHA 
and the OAIS may be considered as possible 
measures of achievement motivation. The 
SSHA consists of items about attitudes to- 
ward study and study habits and discrimi- 
nates between students with high and low 
grade-point averages (1). The OAIS is a con- 
figurally scored personality inventory which 
has been shown to add unique variance to the 
prediction of honor-point ratio at the Univer- 
sity of Minnesota (4). Both tests identify 
personality type variables differentiating
overachievers from underachievers (and
presumably the highly motivated from the less
highly motivated). The HTS achievement
examination provides a basis for evaluating
actual achievement, although its internal
consistency reliability is not as high as
desired (r = .68 by Hoyt’s analysis of variance
technique). To determine how independent
n Achievement is from scholastic aptitude,
scores on the American Council on Education
Psychological Examination (ACE), obtained from
records of the Student Counseling Bureau, were
correlated with it.

The design of the experiment called for
students to be randomly assigned to the
different teaching methods. However, because of
scheduling difficulties it was not possible to
assign


Table 1
The Correlations of n Achievement With Itself and Other Variables

              Random Males      Nonrandom Males   Total Males       Total Females     Total Males and Females
              N = 80            N = 50            N = 130           N = 39            N = 169
              n Ach    n Ach    n Ach    n Ach    n Ach    n Ach    n Ach    n Ach    n Ach    n Ach
              (Pre)    (Post)   (Pre)    (Post)   (Pre)    (Post)   (Pre)    (Post)   (Pre)    (Post)   Mean     SD
n Ach (Pre)                                                                                             5.2      4.6
n Ach (Post)  .49**             .02               [illegible]       .25               .26**             6.0      4.4
HTS (Pre)     [partly illegible: A540, -.03, .02, .09, 407:, .22, -.07, .13, (4.03]                     58.5     7.8
HTS (Post)    .22*     -.03     .10      .03      .18*     .00      -.03     -.12     .12      .02      71.3     7.7
SSHA (Pre)    -.06     .02      -.07     .18      -.07     .08      .12      -.28     -.03     -.01     29.6     10.8
SSHA (Post)   -.05     .13      .02      .31*     -.02     .20*     .17      -.09     .02      .12      30.5     11.9
OAIS (Pre)    .26*     .18      [illegible]       .08      .04      .17      -.01     .10      .03      41.4     10.7
OAIS (Post)   [partly illegible: .20, U, REE, 05.12, .09, -.05, .06, .08]                               42.4     11.5
ACE           [partly illegible: wo, n, Sas, .02, .09, Avy, S, .05, 107]                                101.0    19.7

* Significantly different from zero at the .05 level.
** Significantly different from zero at the .01 level.

all students randomly. Those who could not 
adjust their schedules to the random assign- 
ment were designated as nonrandom students. 


Results 


The concurrent and predictive validities of 
the n Achievement are reported in Table 1 
along with test-retest reliabilities.

It will be observed that the test-retest
reliability with a nine-week interval is only
.26 for the total group. This correlation,
although significant at the .01 level, is only
minimal. If n Achievement measures the
motivation it purports to measure, then that
level of motivation must be a rather unstable
quantity. It may be that an individual’s “true”
level of motivation actually does fluctuate
widely. In the light of what is known about
other trait measures and on a priori grounds
this seems unlikely, but evidence on this point
is presently unobtainable. Nevertheless, if
n Achievement is to have any value for
longitudinal prediction, it must show more
consistency than it has so far.
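
The contrast drawn above between statistical significance and practical size can be made concrete. The sketch below uses the standard t transformation of a correlation coefficient with N − 2 degrees of freedom; the function name is illustrative, and the critical value quoted afterward is an approximation rather than a figure from the article.

```python
import math


def t_for_r(r, n):
    """t statistic for testing a Pearson r against zero (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


r_retest, n_total = 0.26, 169
t_value = t_for_r(r_retest, n_total)   # roughly 3.5 on 167 df
shared_variance = r_retest ** 2        # proportion of variance in common
```

A t of about 3.5 comfortably exceeds the two-tailed .01 critical value of roughly 2.6 for 167 degrees of freedom, yet the pretest and posttest share only about 7 per cent of their variance.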
Some slight evidence of validity could be
indicated by the correlation of n Achievement
with the HTS posttest. For the total male group
the n Achievement pretest correlated .18 with
the HTS posttest. However, the n Achievement
posttest, administered at approximately the
same time as the HTS posttest, correlated .00
with it.


A few of the correlations with OAIS and 
SSHA are significant at the five per cent level.
However, it must be remembered that when 
a large number of correlations are computed, 
a certain number may be “significant” by 
chance alone. Since the correlations are small 
and not consistent among groups, random 
variation seems the most logical explanation 
of these relationships. Similarly, the
correlations with ACE show no consistent
trends. For the total group they are
nonsignificant and low. It would appear that
for this sample n Achievement is independent of
scholastic aptitude.

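
The point about chance significance can be illustrated with simple arithmetic. Table 1 lists four OAIS and SSHA rows by ten columns, or 40 correlations. The sketch below treats them as independent tests at the .05 level, which is an assumption made purely for illustration; the groups in the table overlap, so the tests are not in fact independent.

```python
n_correlations = 40   # four OAIS/SSHA rows by ten columns in Table 1
alpha = 0.05

# Expected number of "significant" results if every true correlation is zero.
expected_by_chance = n_correlations * alpha

# Probability of at least one such result under the independence assumption.
p_at_least_one = 1 - (1 - alpha) ** n_correlations
```

Under these assumptions about two spuriously significant correlations would be expected, and the chance of seeing at least one is close to 87 per cent.
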
Summary 


The test-retest reliability of n Achievement 
(a measure of achievement motivation) and 
its relationship to certain other selected vari- 
ables has been computed on 169 students in 
How To Study classes at the University of 
Minnesota. A test-retest reliability of .26 
after a nine-week interval casts doubt on the 
stability as well as the possible validity of the 
measure. The correlations of n Achievement 
with an achievement examination in the course 
and with two other personality measures which 
correlate with academic success show no con- 
sistently positive or negative relationships. It 
is also independent of scholastic aptitude as
measured by the ACE.


Received August 16, 1956.


Level of Aspiration in a Group of
Peptic Ulcer Patients¹


Irving Raifman 
United States Naval Hospital, National Naval Medical Center, Bethesda, Md. 


Peptic ulcer is now commonly, even jokingly,
regarded as an affliction that is peculiar to
successful, self-driving businessmen. For years
the professional literature on the subject has
described those suffering from peptic ulcers as
striving, overactive, ambitious, and efficient
people. In 1934 Alexander and others (1, 2)
postulated that one could discover in these
patients a “typical conflict” consisting of
“intense receptive and acquisitive wishes
against which the patient fights internally
because they are connected with a sense of
inferiority” (2, p. 127). A number of
subsequent studies have provided support for
this theory.

This is a report of a study comparing the
goal-setting and aspirational behavior of some
peptic ulcer patients with that of a group of
normal, and that of a group of psychoneurotic
subjects (Ss). The Rotter Level of Aspiration
Test (3, 4, 5) was selected for use, not only
because it has been shown to be reliable, but
also because the task involved is both
entertaining and ostensibly a matter of motor
skill. Rotter’s instructions were modified by
omitting the penalties for failures, in order
to encourage as much expression as possible of
subjective, implicit ambition.

¹ This paper is based on a doctoral
dissertation submitted to the department of
psychology, New York University, in 1951. The
study was [illegible] meeting of the APA in New
York, September [illegible]. Grateful
acknowledgment is due [illegible] Tomlinson and
other members of the [illegible] committee,
Dr. Robert Morrow, [illegible] psychologist at
the Bronx VA Hospital, [illegible], and Miss
Elizabeth Broomhead, [illegible] at the United
States Naval Hospital, National Naval Medical
Center, Bethesda, [illegible] for their help,
guidance, and cooperation. [illegible]
conclusions are those of the author and
[illegible] reflect the view or endorsement of
the VA or Navy Department.


Method 
Subjects and Procedure 


The experimental group was composed of 
15 patients with peptic ulcers who were hos- 
pitalized at the Bronx VA Hospital, New 
York. All were white male Ss with the ability 
to read and write English. Their ages ranged 
from 22 to 45, with a mean age of 31.80 years.
The mean educational level was 10.53 years.

One control group consisted of 15 men se- 
lected from the neuropsychiatric wards of the 
hospital and all diagnosed as “psychoneu- 
rotic.” The mean age of this group was 30.73 
and the educational achievement level was 
11.47 years. 

The other control group was selected from 
the general, medical, and surgical wards of 
the hospital. Their illnesses were not serious 
and not of a psychosomatic nature. This group 
was considered normal after each S had satis- 
fied the criteria established in the Cornell 
Selectee Index (6). The mean age was 30.87
and the educational achievement level was 
11.77 years. 

There was no characteristic difference among 
the groups with respect to age or education 
nor was there any peculiar clustering of their 
occupational levels. Twenty of the 45 Ss were 
in the clerical sales group, and 6 in the 
professional, semiprofessional and managerial 
groups. Of these 6, only one was an ulcer 
patient. 

Each man made a total of 55 shots on the 
pinball-like Rotter Board. The first five shots 
were for practice. The remaining shots were
divided into ten trials (five shots per trial),
and before each trial the S was told the score 
he had just made and asked to predict what 
score he would be able to achieve on the next 
series. Using the figures thus obtained—the 
actual scores earned and the “bets” made— 
four final scores were computed in every case. 

The D score resulted from averaging the 
difference between the performance just made 
and the estimate that followed it. When the 
S’s bet was higher than the score he had just 
earned, his D score was positive; when he 
predicted that he would not do as well the 
next time, his D score was negative. This score 
reflects the discrepancy between aim and ac- 
complishment, with a high, positive D score 
indicating that the S was relatively tenacious 
in his optimism. 

The A score is the average of the difference 
between the S’s bet and the following per- 
formance. If his prediction was lower than 
the score he then achieved, his A score was 
positive; if his estimate was too high, his A 
score was negative. This score, then, reflects 
the discrepancy between the level of the goal 
set and the actual attainment. A negative A 
score implies failure to reach the desired goal. 

The I score is the average of the difference
between the S’s score made during the practice
period and the score he estimated he would be
able to make the next time. It is assumed that
the more positive the I score, the
greater is the tendency to set unrealistic goals 
when starting something new. 

The C score represents the change between 
the S’s first five and his last five D scores. A 
positive C score indicates that he kept
improving and also expected more of himself,
while a negative C score reflects his willing- 
ness to lower his level of aspiration when 
faced with frustration and failure. 
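
The four scores defined above can be sketched in code. Several details are assumptions rather than statements of the author's procedure: the first bet is taken to follow the practice score, the I score is taken as the single difference between the first bet and the practice score, and the C score is taken as the sum of the last five D differences minus the sum of the first five. The function name and the sample figures are hypothetical.

```python
def aspiration_scores(practice, performances, bets):
    """Compute D, A, I, and C scores for one subject.

    practice:     score on the five practice shots
    performances: scores on trials 1..10
    bets:         estimates made before trials 1..10
    """
    # Score the subject had just made when each bet was placed.
    previous = [practice] + performances[:-1]
    d_diffs = [bet - prev for bet, prev in zip(bets, previous)]
    a_diffs = [perf - bet for perf, bet in zip(performances, bets)]
    half = len(d_diffs) // 2
    return {
        "D": sum(d_diffs) / len(d_diffs),   # bet minus preceding performance
        "A": sum(a_diffs) / len(a_diffs),   # performance minus preceding bet
        "I": bets[0] - practice,            # first bet minus practice score
        "C": sum(d_diffs[half:]) - sum(d_diffs[:half]),  # assumed: last five minus first five
    }


# Hypothetical subject.
scores = aspiration_scores(
    practice=20,
    performances=[22, 21, 25, 24, 26, 27, 25, 28, 30, 29],
    bets=[25, 24, 23, 27, 26, 28, 29, 27, 30, 32],
)
```

For this made-up subject the D score is positive (bets run ahead of performance), the A score is negative (goals are usually missed), and the C score is slightly negative.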


Results and Discussion 


The figures in Table 1 point to the ulcer 
group as more striving than the normal or the 
psychoneurotic group. The C scores are not 
significantly different, but the other three in- 
dicate that there were significant differences 
between the patients with ulcers and the other 
patients. One might well conclude that per- 
sons suffering from peptic ulcers have diffi- 
culty curbing their aspirations, that they are 
less able than others to attain their goals be- 
cause they lack an appreciation of the prob- 
lems and limits involved, and that they are 
willing to gamble for high stakes without be- 
ing familiar with the task they are about to 
undertake. At any rate, this study, which is a 
part of a wider investigation of Alexander’s 
theory of the “typical conflict situation” of 
peptic ulcer patients, adds to the evidence 
that such persons do indeed tend to set them- 
selves impossibly high goals, and that they 
are different from normal and psychoneurotic 
groups on this score. 

All the scores of the normal Ss are con-
sistent with what one would expect of them. 
In comparison with the ulcer patients, their 
needs are more realistic and it follows that 
they should attain more. They approach new 
tasks with relative caution, and as they be- 
come familiar with them, they move forward 
with the hope of doing better. 


Table 1

Comparison of the Mean Scores on the Level of Aspiration Test for Ulcer Patients and Control Groups

             Ulcer (U)        Normal (N)       Neurotic (PN)            t values
Score        Mean     SD      Mean     SD      Mean     SD       U-N      U-PN     N-PN
D score      2.83    1.24     1.61    1.40     1.67    1.53      2.44*    2.23*     .11
A score     -2.78    1.45    -1.39    1.55    -1.27    1.24      2.44*    2.96**    .23
I score      5.87    5.17      .87    4.30     2.53    4.78      2.78**   1.78      .97
C score     -7.40   11.88     3.27   15.64     -.67   13.05      2.03     1.36      .72

* Significant at .05 level.
** Significant at .01 level.
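
The t values in Table 1 can be approximated from the tabled means and standard deviations. The sketch below assumes 15 subjects per group and a standard error of sqrt((SD1² + SD2²) / (n − 1)), which is appropriate if the SDs were computed with n in the denominator. That formula is an assumption adopted because it reproduces the printed values for the ulcer versus normal comparisons, not a statement of the author's method; the I score figures also assume the decimal placement .87 and 4.30.

```python
import math


def t_from_summary(mean1, sd1, mean2, sd2, n=15):
    """t for two independent groups of equal size n, from means and SDs."""
    standard_error = math.sqrt((sd1 ** 2 + sd2 ** 2) / (n - 1))
    return abs(mean1 - mean2) / standard_error


t_d_ulcer_normal = t_from_summary(2.83, 1.24, 1.61, 1.40)   # tabled: 2.44
t_i_ulcer_normal = t_from_summary(5.87, 5.17, 0.87, 4.30)   # tabled: 2.78
```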




Summary 


The “typical conflict situation” in peptic
ulcer patients which stimulates them to aspire
beyond their level of achievement suggests that
ulcer patients differ from other patients in
goal-setting behavior.

Fifteen veteran peptic ulcer patients were
compared with a like number of normals and
fifteen psychoneurotic patients on four
measures of their performance on the Rotter
Level of Aspiration Board. The ulcer patients
were significantly higher in their aspirations
and lower in their attainment than either of
the two control groups, and more inclined than
the normal subjects to overestimate their
ability at the beginning of the problem. All of
these differences appear to indicate that ulcer
patients are an ambitious lot who cannot
achieve their aspirations because they set
goals which to others seem [illegible].

The results support the impressions of
Alexander and others with regard to the
aspirational drives of peptic ulcer patients.


Identification with Photographs of People¹


Several attempts have been made to deter-
mine attitudes and personality factors through 
identification with photographs of people (1, 
2, 4, 5, 6). In the Szondi test, for example, 
patterns of likes and dislikes for photographs 
of different types of mental patients are as- 
sumed to reflect a subject’s own personality 
traits (3). 

The rationale of the identification hypothe- 
sis seems to be expressed by the saying “we 
like those who have the same traits as our- 
selves,” or, from an opposing point of view, 
“opposites attract.” The purpose of the pres- 
ent research was to test these alternative 
identification hypotheses. 

The possibility that a person may identify 
positively with photographs of people of the 
same sex but negatively with those of the op- 
posite sex was considered. The research re- 
ported here was limited to a study of identifi- 
cation with pictures of those of the same sex 
as the subjects (Ss). 

The trait of self-assertiveness was chosen to 
test the identification hypotheses. The term 
“self-assertive” appeared to be understand- 
able to the college students who were to serve 
as Ss and it could be rather objectively re- 
lated to overt behavior. Self-assertiveness was 
considered a trait of major importance in that 
it was defined similarly to traits such as domi- 
nance, ascendancy, aggressiveness, etc., which 
generally appear in factor analyses of
personality measurement.


Method 


Subjects 

A class of 33 junior and senior undergraduates
in a course in Psychology of Personality at
Muskingum College served as Ss. There were 15
men and 18 women in the class. It was felt that
in the small liberal arts college where the
research was conducted the students were well
acquainted with each other and had a sufficient
background in psychology to make fairly valid
ratings and self-assessments. The Ss were naive
with regard to the specific nature of the
research and the connection between the various
tests. All of the tests were presented as part
of a battery of tests and evaluations included
in the course. The students were homogeneous
with respect to age, intelligence, and
socioeconomic background due to the selection
involved in admission to the college and the
attrition resulting from prerequisites for the
course.

¹ This research was done at Muskingum College.


Tests 


Picture identification test. Previous experi- 
menters have usually classified photographs 
on some a priori basis (e.g., diagnosis of a pa- 
tient) and then assumed the photograph had 
an equal stimulus value for all Ss. In order to 
correct for the possible weakness of this
assumption, it was decided to permit each S to
make his own selections of photographs he
judged represented people high and low on
the trait of self-assertiveness. This procedure
allowed for individual differences and, at the 
same time, provided a direct determination of 
the stimulus value of a photograph for the 
individual. 

Pictures of college students, unfamiliar to
the Ss, were selected from a college annual
for the test. The pictures were of the same
size and photographic quality and all of the
students had dressed uniformly for the
pictures. Girls wore dark sweaters and white
beads and boys wore white shirts and dark
ties. All pictures included the shoulders and
face. The pictures were spaced in a square
pattern in groups of four on white 6-in. by
8-in. cards. There were 15 groups of boys’
pictures for the male series and 15 groups of
girls’ pictures for the female series. Each
card in the series was designated
alphabetically and the pictures were numbered
for identification purposes. Men received the
male series and women received the female
series. The Ss were given a form with written
instructions to select from each group of four
pictures the person they judged as the most
self-assertive and the person they judged to
be the least self-assertive. Self-assertiveness
was defined as a “tendency to have definite
opinions on subjects and to state them; a
tendency to dominate or influence associates;
a tendency for a person to have his (or her)
own way and to not back down in a conflict.”

When the judgments were completed, each S
received a second form with written
instructions to select from each group of four
pictures the person they felt they would like
best and the person they felt they would like
least “as a friend.” In scoring the test a +
was given for each instance where there was a
direct relation between the self-assertive
judgments and the affective reactions (a “most
self-assertive” selection paired with a “liked
most” selection or a “least self-assertive”
with a “liked least” selection). A − was scored
when there was an inverse relation between the
selections (a “most self-assertive” selection
with a “liked least” selection or vice versa).
The score for the test was the sum of the
minuses subtracted from the sum of the pluses.
The possible score range was from −30 to +30
points.
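
The scoring rule just described can be sketched as follows. The record layout (one dictionary per card holding the numbers of the four chosen pictures) and the function name are assumptions made for illustration only.

```python
def picture_identification_score(cards):
    """Sum of pluses minus sum of minuses over all cards.

    Each card is a dict with the picture numbers chosen as
    'most_assertive', 'least_assertive', 'liked_best', 'liked_least'.
    """
    score = 0
    for card in cards:
        # Direct relations earn a plus.
        if card["most_assertive"] == card["liked_best"]:
            score += 1
        if card["least_assertive"] == card["liked_least"]:
            score += 1
        # Inverse relations earn a minus.
        if card["most_assertive"] == card["liked_least"]:
            score -= 1
        if card["least_assertive"] == card["liked_best"]:
            score -= 1
    return score


direct = [{"most_assertive": 1, "least_assertive": 2,
           "liked_best": 1, "liked_least": 2}] * 15
inverse = [{"most_assertive": 1, "least_assertive": 2,
            "liked_best": 2, "liked_least": 1}] * 15
unrelated = [{"most_assertive": 1, "least_assertive": 2,
              "liked_best": 3, "liked_least": 4}] * 15
```

With 15 cards and at most two pluses or two minuses per card, the extremes are +30 and −30, as stated in the text.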


Rating scale. A six-point rating scale for
the trait of self-assertiveness was devised
employing the Q-sort technique. Each S was
instructed to rate every other S, in accordance
with the quasi-normal distribution provided
with the scale. The same definition of
self-assertiveness provided for the picture
identification test was supplied with the
rating scale. Each S received a self-assertive
rating score which was the average of all the
ratings he received on the six-point scale.

Guilford-Martin GAMIN test. The GAMIN was
selected as providing an “ascendancy” scale
(A scale) and a “lack of inferiority” scale
(I scale). It was felt that both of these
scales might be related to the trait of
self-assertiveness as previously defined. The
test was administered and scored according to
the standardized procedures. Raw scores for
the A and I scales were used for purposes of
analysis.


Results 


The results were withheld from the Ss until 
all of the tests had been completed. Upon 
questioning at this time, only four Ss seemed 
to have recognized the possibility of
correlating the two parts of the picture
identification test. The majority of the
students expressed the feeling that there would
only be a chance association between their
judgment choices and their affective reactions.
Thus, it could
be reasonably assumed that any identification 
which took place was primarily at a noncon- 
scious level. This was not true for the GAMIN 
where the majority of the students felt they 
could recognize the significance of many items 
and, of course, the purpose of the rating scale 
was self-evident. 

As there were no significant differences
between the sexes on any of the measures the
Ss were pooled into one group for correlation
purposes. Pearson r was used except for the
correlation between the rating scale and the
picture identification test where a rank-order
correlation was employed due to the restriction
on variation of the rating scale scores.

All of the correlations are in a direction to
support a positive identification hypothesis.
The Ss who preferred photographs of people of
their own sex whom they had previously judged
to be high on self-assertiveness tended to have
relatively high scores on measures of
ascendancy, lack of inferiority, and
self-assertiveness. The Ss who least preferred
photographs of people they judged to be high on
self-assertiveness tended to have relatively
low scores on measures of ascendancy, lack of
inferiority, and self-assertiveness. None of
the correlations between the picture
identification test and the trait measures were
high. The correlation with lack of inferiority
was +.52, significant at the .01 level; the
correlation with ascendancy was +.42,
significant at the .05 level; and the
rank-order correlation with the rating scale
measure of self-assertiveness was +.32,
significant at only the .10 level.
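
For readers who want to see how a rank-order correlation differs from Pearson r, a minimal sketch follows. It assumes the rank-order coefficient meant here is Spearman's, computed as a Pearson correlation between ranks with tied values given their average rank; the function names and sample data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_order_correlation(x, y):
    """Spearman coefficient: Pearson r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical ratings (restricted range, many ties) and test scores.
ratings = [3, 3, 4, 4, 4, 5, 2, 3]
test_scores = [2, -4, 6, 10, 0, 12, -8, 4]
rho = rank_order_correlation(ratings, test_scores)
```

Because only the ordering of the scores enters the computation, the coefficient is less sensitive to the compressed spread of averaged ratings.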

Further analysis yielded a Pearson r of
+.48 between the ascendancy and lack of
inferiority scales of the GAMIN. This was 
not surprising in view of the fact that both 
scales correlated significantly with the pic- 
ture identification test. The results suggest 
that, for the Ss used in this research, the A 
and I scales of the GAMIN are not inde- 
pendent. 


Discussion 


Support for the positive identification hy- 
pothesis, obtained from this study, suggests 
other applications of a picture identifica- 
tion technique to personality measurement. 
Through the picture identification technique 
it may be possible to obtain quantitative 
measurement of many personality dimensions 
by correlating a S’s ratings of photographs 
along a given personality dimension with his 
affective reactions to the same photographs. 
A picture identification technique applied to 
personality measurement offers the advan- 
tages of objective, quantitative scoring, little 
reliance on verbal stimuli, and a measurement 
largely based on unconscious projection. These 
advantages and the results of this study in- 
dicate that further research in this area may 
be profitable. 


Summary 


Eighteen women and fifteen men under- 
graduate college students were given photo- 
graphs of unfamiliar college students and 
were asked to select pictures they judged to 
represent people high and low on the trait of 
self-assertiveness. Male Ss judged men’s pic- 
tures and female Ss judged women’s pictures. 
The Ss were then asked to indicate which of 
the same pictures they liked best and least. 
A measure of the strength and direction of the 
relationship between the Ss’ judgments and 
affective reactions was obtained from this pic- 
ture identification test. 

The Ss also rated each other by a Q-sort 
technique for the trait of self-assertiveness, 
and scores on the Guilford-Martin Ascend- 
ancy and Lack of Inferiority scales of the 
GAMIN test were obtained. Correlations be- 
tween the picture identification test and the 
trait measures of lack of inferiority, ascend- 
ancy, and self-assertiveness were significant at 
the .01 level, the .05 level, and the .10 level, 
respectively. None of the correlations were 
high, but all were in a direction to support 
the hypothesis of positive identification with 
photographs of people of the same sex. The 
Ss who tended to have a positive relationship 
between their preferences for photographs and 
their ratings of the same photographs along 
a self-assertive personality dimension, tended 
to score high on measures of ascendancy, lack 
of inferiority, and self-assertiveness. A nega- 
tive relationship between the picture ratings 
and preferences was obtained for Ss scoring 
low on the traits of ascendancy, lack of in- 
feriority, and self-assertiveness. 

The results indicate that a picture identifi- 
cation technique may be profitably applied to 
personality measurement when the S is per- 
mitted to determine the stimulus value of a 
photograph. 


The Goodenough Draw-A-Man Test as a Measure of 
Intelligence in Aged Adults 


Allan W. Jones and Thomas A. Rich 


Moosehaven Research Laboratory, Orange Park, Florida 


Historically, the Draw-A-Man test was de- 
veloped for children, with whom it has proved 
to be a successful indicator of general intel- 
lectual level (4). Until now, its use with 
adults has been confined primarily to mental 
defectives (1, 2, 5). There is certain prelimi- 
nary evidence, however, that this technique 
could be used with older normals as well. 
Although not concerned specifically with 
intelligence [...] Tuckman, and [...] ed quali- 
[...] a marked correspondence [...] 
human figure drawings and general performance 
level on a psychomotor evaluation of an 
elderly group. In addition, height of drawing 
appeared to be related positively to 
performance level. 

From these observations, it appeared that 
human figure drawings of older adults might 
be useful as an indicator of their general 
intellectual functioning. In this study the 
Goodenough test was employed for [...] 
evaluation and the Wechsler-Bellevue for the 
measure of intelligence. 


Procedure 


The Ss in this study were 40 male resi- 
dents of a fraternal home for the aged (6). 
Their average age was 78.5 (SD = 6.2), 
ranging from 67 to 93 years. The modal edu- 
cation was 8 years with a range from 3-14 
years. All were volunteers, as well as partici- 
pants in an ongoing study of age changes in 
intelligence. 

Each S was tested individually in a single 
testing session of approximately 1½ hours’ 
duration. The Wechsler-Bellevue Intelligence 
Test, Form I, was administered, followed by 
the Goodenough Draw-A-Man Test. For the 
Goodenough, each S was provided with an 
8½ × 11-inch sheet of blank paper and a 
pencil. He was then instructed to draw a man 
and to make the drawing as complete as pos- 
sible. No time limit was set. 

All the drawings were scored independently 
by both the authors according to the pro- 
cedure described by Goodenough (3). The 
total score for each subject was the average 
of the two separate ratings. The height of 
each drawing to the nearest millimeter was 
also determined once by each author. In the 
few cases where a slight discrepancy existed, 
the average of the two measures was used. 
In addition, Wechsler Full Scale, Verbal, and 
Performance IQs were obtained by means of 
the standard age correction (14). 


Results 


Goodenough reliability. The Pearson product- 
moment correlation between the two sets 
of independent ratings on the Goodenough 
was .84. This compares favorably with other 
studies reporting interjudge correlations 
ranging from .80 to .96 (10, 13). Previous studies 
(3, 10) with younger groups have obtained 
test-retest reliabilities from .68 to .94. 
Regarding height of drawing, Kleemeier, Rich, 
and Justiss (7) found test-retest reliability 
with older males to be .80 over a two-week 
interval. Similarly, Lehner (8) reports no 
statistical difference between drawing heights in 
two testings. 


Table 1 


Descriptive Statistics on Wechsler and 
Goodenough Test Scores 


                        Mean      SD      Range 
Goodenough              21.86    10.29    2.0-45.0 
Drawing Height          10.24     4.34    1.0-18.4 
W-B Full IQ             99.80    11.65    70-126 
Verbal IQ              100.48    12.07    72-126 
Performance IQ         105.18     9.08    80-128 
W-B full wtd. score     62.63    22.20    9-111 
Verbal wtd. score       36.10    13.54    7-67 
Perf. wtd. score        26.53    10.29    2-53 


Descriptive statistics. Table 1 presents the 
means, SDs, and ranges of Goodenough total 
score, drawing height, Wechsler IQs and 
Wechsler weighted scores for the 40 cases. 
Drawing height is reported in centimeters. 

Compared to Goodenough’s normative data 
(4), the average drawing score of 21.86 for 
the elderly group represents approximately 
the 8-year level. However, the SD of 10.29 is 
almost twice that found with the children 
(5.4), a finding due in part to the compara- 
tively greater age spread in the older sub- 
jects. 

Relation of Goodenough to Wechsler. In 
Table 2 are shown the intercorrelations 
between Goodenough test score, drawing height, 
and the various W-B IQs and weighted scores. 
The r between Goodenough and height of 
drawing is .64. Both of these appear to 
predict about equally Wechsler Full Scale, 
Verbal, and Performance IQs, and full weighted 
score, with rs ranging from the highest of .65 
between Goodenough and W-B Full Scale to the 
lowest of .47 between drawing height and W-B 
Performance IQ. All these rs are significantly 
different from zero beyond the .01 level. 


Fig. 1. Averaged Goodenough Test score plotted 
against W-B Full Scale IQs. 


Figure 1 shows the 40 averaged Good- 
enough scores plotted against their respective 
Wechsler IQs. There are but four cases which 
deviate markedly from the rest of the sam- 
ple. All of them have had either special train- 
ing or an interest in art. 

The rs between the Goodenough and the 
11 W-B subtests form the first column in 
Table 3. Next are listed these same rs with 
age partialled out. Since both the Goodenough 


Table 2 


Goodenough and Wechsler-Bellevue Intercorrelation Matrix 


                        A        B        C        D        E        F 
A. Age 
B. WB Full IQ         -.32* 
C. WB Verb. IQ        -.14     .95** 
D. WB Perf. IQ        -.17     [?]      .70** 
E. WB wtd. score      -.28     .98**    .93**    .83** 
F. Goodenough         -.49**   .65**    .60**    .56**    [?] 
G. Drawing Height     -.31     .62**    .58**    .51**    .63**    .64** 


 * p < .05 level. 
** p < .01 level. 


Table 3 


Correlations Between Wechsler-Bellevue Subtests and 
Goodenough Score, and Between W-B Subtests 
and Goodenough with Age Partialled 


                                        Goodenough: 
Wechsler subtests       Goodenough      Age constant 
Picture Completion         .65             .54 
Digit Symbol               .60             .45 
Comprehension              .58             .50 
Information                .54             .48 
Object Assembly            .53             .52 
Picture Arrangement        .53             .38 
Vocabulary                 [?]             .49 
Arithmetic                 .50             .42 
Block Design               .49             .35 
Similarities               .41             .30 
Digit Span                 [?]             .31 


r of .32 = .05 level; r of .41 = .01 level. 


and the W-B subtests correlate with age, 
partial correlation coefficients were computed 
holding age constant. It was felt that the 
relationship between the Goodenough and the 
W-B subtests would be more meaningful with 
the age factor controlled. Considering the 
partial correlations, Similarities and Digit 
Span do not correlate significantly (.05 level) 
with the Goodenough. Block Design and Picture 
Arrangement reach the .05 level, and the 
other subtests are at the .01 level or better. 
However, holding age constant does not alter 
essentially the general relationship between 
the various subtests and the Goodenough. 
From inspection of the partial correlations, 
verbal and performance tasks appear to 
be about equally involved in drawing [...]. 
The highest relationships are found for 
Picture Completion, Object Assembly, 
Comprehension, Vocabulary and Information. 

When age was also partialled out of the 
correlations between Wechsler Full Scale, 
Verbal, and Performance IQs, [...], 
no significant change [...]. 
However, there was a small but progressive 
drop in height of drawing with age: an 
average figure height of 11.65 cms. for the 66-78 
age group, 9.72 cms. for those between [...] 
85, and, finally, 7.96 cms. for those 
aged 86 and above. Other studies [...] 
have reported similar decreases in height 
of figure drawing with age. 
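Holding age constant can be illustrated with the standard first-order partial correlation formula. The sketch below is an illustration under assumptions, not the authors' computation: the function name is hypothetical, and the inputs are three coefficients that can be read from Table 2 (Goodenough with W-B Full IQ, .65; Goodenough with age, -.49; W-B Full IQ with age, -.32).

```python
from math import sqrt

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant (first-order partial r)."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# x = Goodenough score, y = W-B Full IQ, z = age (coefficients read from Table 2).
print(round(partial_r(0.65, -0.49, -0.32), 2))
```

Under these inputs the coefficient drops only slightly, from .65 to about .60, which is in line with the statement that partialling out age does not essentially alter the relationships.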


Qualitative differences. The percentage of 
the older males receiving credit for each of 
the 51 Goodenough items was computed. 
These percentages were compared to those of 
the normal 8-year-old children in the original 
Goodenough data (3), a group comparable 
only in terms of average Draw-A-Man score. 
Seven items differentiated young and old by 
at least 32 percentage points. The greatest of 
these differences showed that 98% of the chil- 
dren included clothing in their drawings, as 
opposed to only 31% of the oldsters. Too, 
children were more inclined to draw both 
arms and legs in two dimensions (86% vs. 
51%), and to score a point for motor co- 
ordination (77% vs. 45%). The older males 
drew ears more frequently (64% vs. 32%), 
indicated shoulders (50% vs. 12%), had the 
length of trunk greater than the breadth (81% 
vs. 49%), and drew the outline of the neck 
continuous with the head, trunk, or both 
(65% vs. 32%). 

The intrusion of variables other than age 
in comparing these children and older adults 
makes detailed interpretation unwarranted. It 
does appear, however, that there are many 
decided qualitative differences between young 
and old human figure drawings. Undoubtedly, 
a scale designed specifically for an older group 
could detail the differences much more 
adequately. 


Discussion 


Although the Draw-A-Man test is not of- 
fered as a substitute for a more complete in- 
telligence testing in the aged, it does appear 
that a quick estimate of intellectual level can 
be obtained by its use. Certainly, it would be 
most appropriate for those cases not amenable 
to a more formal psychometric evaluation. 
Through use of a scale modified along the 
lines Lorge suggests (9), it might be possible 
to increase the over-all intelligence prediction 
accuracy. Still, it is quite interesting to note 
that height of drawing predicts IQ as well as 
the detailed Goodenough scoring procedure. 
To be sure, this finding needs cross validation; 
but because of the generally poor quality 
of drawing in the aged (average 8-year 
level) and decline in psychomotor control 
[...], it may be that drawing height 
reflects not so much age as total available 
motor output. This would undoubtedly be more 
likely true for the somewhat deteriorated case, 
where a certain minimal energy level to draw 
at all is needed. 

A final point to be considered is that the 
Goodenough test is undoubtedly influenced 
by both previous training and experience in 
art. For Ss with such a background the drawing 
test constitutes a different kind of task 
and probably reflects training rather than in- 
tellectual level. The scatter plot shown in 
Fig. 1 lends support to this, with the four 
atypical cases being Ss either at present ac- 
tive in art or expressing strong interest in it 
in the past. 


Summary 


The Wechsler-Bellevue Intelligence Test, 
Form I, and the Goodenough Draw-A-Man 
Test were administered to 40 aged (M = 
78.5) male residents in a fraternal home for 
the aged to test the relationship between hu- 
man figure drawings of older adults and their 
general intellectual level. 

The two drawing variables obtained were 
Goodenough test score and height of figure 
in centimeters. Both measures correlated about 
equally with W-B Full Scale, Verbal, and Per- 
formance IQs, and total weighted score, with 
correlations ranging from .47 to .65 (p< .01 
level). Also, partialling out age did not alter 
the essential relationship between Goodenough 
and the W-B subtests. 

The average drawing score of 21.86 was 
comparable to that obtained by the typical 
8-year-old, although marked qualitative dif- 
ferences were found. It was suggested that the 
Goodenough could be used as a quick esti- 
mate of IQ in aged adults, but that a scoring 
procedure designed specifically for such people 
might raise the over-all predictability. 


Received August 14, 1956. 


References 


1. Berdie, R. F. Measurement of adult intelligence 
by drawings. J. clin. Psychol., 1945, 1, 288-295. 

2. Earl, C. J. C. The human figure drawings of 
adult defectives. J. ment. Sci., 1933, 79, 305-328. 

3. Goodenough, Florence L. Measurement of 
intelligence by drawings. Yonkers, N. Y.: World 
Book Company, 1926. 

4. Goodenough, Florence L., & Harris, D. B. Studies 
in the psychology of children’s drawings: II. 
1928-1949. Psychol. Bull., 1950, 47, 369-433. 

5. Gunzburg, H. C. Scope and limitations of the 
Goodenough drawing test method in clinical 
work with mental defectives. J. clin. Psychol., 
1955, 11, 8-15. 

6. Kleemeier, R. W. Moosehaven: Congregate 
living in a community of the retired. Amer. J. 
Sociol., 1954, 59, 347-351. 

7. Kleemeier, R. W., Rich, T. A., & Justiss, W. A. 
The effects of alpha-(2 piperidyl) benzhydrol 
hydrochloride (Meratran) on psychomotor 
performance in a group of aged males. J. 
Gerontol., 1956, 11, 165-170. 

8. Lehner, G. F. J., & Gunderson, E. K. Height 
relationships on the Draw-a-Person Test. J. 
Pers., 1953, 21, 392-399. 

9. Lorge, I., Tuckman, J., & Dunn, M. B. Human 
figure drawing by younger and older adults. 
Amer. Psychologist, 1954, 9, 420-421. (Abstract) 

10. McCarthy, Dorothea. A study of the reliability 
of the Goodenough test of intelligence. J. 
Psychol., 1944, 18, 201-216. 

11. Rich, Thomas A. Some personality variables in 
human figure drawings. Unpublished master’s 
thesis, Univer. of Florida, 1955. 

12. Wechsler, David. The measurement of adult 
intelligence. (3rd ed.) Baltimore: Williams & 
Wilkins, 1944. 

13. Williams, J. H. Validity and reliability of the 
Goodenough Intelligence Test. Sch. & Soc., 
1935, 41, 653-656. 




Journal of Consulting Psychology 
Vol. 21, No. 3, 1957 


On WAIS Difference Scores 


Quinn McNemar 
Stanford University 


Although differences between subtest scores 
on the Wechsler scales are of supposed 
diagnostic significance, the recent WAIS Manual 
does not include information pertaining 
to norms for, or reliabilities of, any of the 55 
possible difference scores among the 11 tests. 
On the basis of the reliabilities of the 11 
tests and the intercorrelations among the 
tests, as given in the manual, it is possible to 
extract data needed for evaluating difference 
scores. This note will present such data for 
ages 25-34, for which N = 300. 

The mean of the algebraic differences will, 
of course, be zero. The standard deviation of 
any distribution of differences is readily ob- 
tained by utilizing the well-known formula 


Table 1 

[Table body not recoverable: an 11 × 11 matrix over the WAIS subtests 
(Information, Comprehension, Arithmetic, Similarities, Digit Span, 
Vocabulary, Digit Symbol, Picture Completion, Block Design, Picture 
Arrangement, Object Assembly), with the reliabilities of the difference 
scores below the diagonal and, above the diagonal, their standard 
deviations and standard errors of measurement, the latter in italics.] 

for the variance of the difference between 
correlated scores. The 55 needed (normative) 
standard deviations, σ_D values, are given as 
the Arabic numbers above the diagonal in 
Table 1. It will be noted that these σ_D’s 
vary from 1.8 to 3.6 (in scaled score points). 
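The well-known formula referred to here is σ_D² = σ₁² + σ₂² − 2 r₁₂ σ₁ σ₂. The following is a minimal sketch of it; the function name and the sample intercorrelations are hypothetical, and the scaled-score SD of 3 is the conventional WAIS value, assumed here because the note does not state it.

```python
from math import sqrt

def sd_of_difference(sd1, sd2, r12):
    """SD of (score1 - score2) for two correlated scores."""
    return sqrt(sd1 ** 2 + sd2 ** 2 - 2 * r12 * sd1 * sd2)

# Assumed scaled-score SD of 3 for both subtests; the r values are illustrative.
for r12 in (0.30, 0.50, 0.80):
    print(r12, round(sd_of_difference(3, 3, r12), 2))
```

The higher the correlation between two subtests, the narrower the normative spread of their differences, which is why the σ_D values vary from pair to pair.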

The reliability coefficients, calculated by a 
formula given by Kelley (1, p. 415), for the 
55 sets of difference scores are given below 
the diagonal. These coefficients vary from .84 
down to .25, with a median of .60. Only 10 
of the 55 reliabilities exceed the not so re- 
spectable value of .70. 

The 55 standard errors of measurement for 
the difference scores are set in italics above 
the diagonal in Table 1. These were 
calculated by the usual formula for σ_e, i.e., σ_e(d) 
= σ_D √(1 − r_dd), then checked by another 
formula from Kelley (1, p. 415). Note that, 
as might have been expected from the low 
reliabilities, these σ_e(d)’s are fairly large relative 
to the respective SDs for distribution of 
difference scores. In other words, a sizable 
portion of a given difference score variance is 
attributable to errors of measurement. 
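A small sketch may make the relation between the reliability of a difference score and its standard error of measurement concrete. It uses the textbook formula for the reliability of a difference between two equally variable scores; whether this is exactly the formula given by Kelley is an assumption, and all names and numerical values below are illustrative rather than taken from Table 1.

```python
from math import sqrt

def difference_reliability(r11, r22, r12):
    """Reliability of a difference between two equally variable scores."""
    return (r11 + r22 - 2 * r12) / (2 - 2 * r12)

def sem_of_difference(sd_diff, r_dd):
    """Standard error of measurement of the difference: sd_diff * sqrt(1 - r_dd)."""
    return sd_diff * sqrt(1 - r_dd)

# Illustrative values only: common SD, two subtest reliabilities, intercorrelation.
sd, r11, r22, r12 = 3.0, 0.80, 0.70, 0.50
sd_diff = sqrt(2 * sd ** 2 * (1 - r12))  # SD of the difference for equal SDs
r_dd = difference_reliability(r11, r22, r12)
print(round(r_dd, 2), round(sem_of_difference(sd_diff, r_dd), 2))
```

With these values the two subtests are each fairly reliable, yet their difference has a reliability of only .50, because the intercorrelation removes shared true-score variance while the error variances add.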

The practical meaning of the foregoing 
should be obvious. When we consider that 
the search for “significant” differences from 
among 55 possible differences tends always to 
capitalize somewhat on chance, it would not 
seem unreasonable to insist that any obtained 
difference be about 2.5 times the appropriate 
σ_e(d) before accepting the difference as non- 
chance. But even when a difference is so 
judged as indicative of a real disparity in 
“ability,” it is still necessary to raise the 
question as to its possible diagnostic 
significance. It is here that the SD of the 
distribution of difference scores for the normative 
group needs to be considered. Take, for ex- 
ample, a difference of 5 for Arithmetic versus 
Comprehension. Such a difference is “signifi- 
cant” at the .01 level so far as error of meas- 
urement is concerned, but the fact that the 
normative distribution of differences between 
these two tests has an SD of 3.0 indicates that 
10 per cent of normals will have differences 
as large as 5 points. 
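The 10 per cent figure can be checked under the assumption, made here for illustration, that normative differences are normally distributed around zero. The function name is hypothetical.

```python
from math import erfc, sqrt

def share_at_least(diff, sd_diff):
    """Two-tailed share of a zero-mean normal distribution with |difference| >= diff."""
    z = diff / sd_diff
    return erfc(z / sqrt(2))  # equals 2 * (1 - Phi(z))

# A 5-point Arithmetic vs. Comprehension difference against a normative SD of 3.0.
print(round(share_at_least(5, 3.0), 3))
```

The result is a little under one in ten, in agreement with the statement that about 10 per cent of normals will show a difference this large in one direction or the other.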

The above example was chosen to be typi- 
cal; the range for this sort of thing is from 
4 per cent to as many as 30 per cent of nor- 
mals yielding “significant” differences (as 
judged by error of measurement). Could it 
be that so many “normals” among the age 
25-34 standardization group are producing 
abnormal differences? 

When one considers the reliabilities and in- 
tercorrelations of the 11 tests given in the 
WAIS Manual for other age levels, there is 
no reason for believing that the reliabilities 
for difference scores at other age levels will 
deviate much from the figures given in 
Table 1 for the 25-34 age bracket. 


The Effects of Scale and Practice on WAIS 
and W-B I Test Scores 

Samuel Karson, Kenneth B. Pool, and Sheldon L. Freud 

School of Aviation Medicine, USAF, Randolph Air Force Base, Texas 


The newly standardized Wechsler Adult 
Intelligence Scale was developed to compensate 
for several recognized defects in the older 
Wechsler-Bellevue Intelligence Scale, Form I. 
Although a number of the original W-B I 
items have been retained in the WAIS, the 
new scale is claimed to provide a more 
adequate range of item difficulty, more reliable 
subtests, and to be based on a more 
representative reference sample (6). 

The introduction of the new scale has raised 
several problems for psychologists. Problems 
of equivalence are involved in cases where 
scores from both scales may have to be 
considered together, as is frequently necessary in 
clinical, research, and statistical operations. 
Problems of transfer from one scale to the 
other will have to be considered when 
patients are retested, for any reason, with a 
second scale. Since there is no alternate form 
of WAIS, W-B I may frequently be used in 
this way. 

These problems have already been 
experienced by the writers in their Air Force 
situation, and the present study was undertaken 
to assess the equivalence of WAIS and W-B I 
scales and to evaluate the transfer effects from 
one to the other on a sample of Air Force 
flyers referred to the School of Aviation 
Medicine, USAF, for medical and 
psychological evaluation. 

The specific questions investigated were (a) 
the equivalence of comparable scores on the 
two scales (this was studied both by correla- 
tion and by comparison of scores of the same 
subjects on both scales with order of adminis- 
tration rotated experimentally); and (b) prac- 
tice effects from WAIS to W-B I and also 
from W-B I to WAIS. Since the performance 
scale tests are generally speeded, and for the 
most part involve manipulation of the test 
materials, it was expected that transfer would 
be greater for these subtests. 

The sample consisted of 52 Air Force flyers 
whose average age was 33, with a range of 20 
to 39 years. Their mean number of years of 
education was 14, with a range of 12 to 16 
years. These officers were referred to the 
Department of Clinical Psychology as a part of 
their consultation at the School of Aviation 
Medicine. They were referred for evaluation 
between July, 1955, and May, 1956, and were 
tested with the WAIS and W-B I as part 
of their diagnostic psychological evaluation. 
Each individual was administered the WAIS 
and W-B I by the same psychologist, the scale 
given first being alternated so that one-half of 
the sample had W-B I first, while the other 
half had WAIS first. Almost all of the testing 
was accomplished by one experienced ex- 
aminer, although two other examiners also 
participated in the test administration to a 
limited extent. The sample employed in this 
research is believed to be typical of the fly- 
ing personnel who are referred for consulta- 
tion in the United States Air Force. The mean 
and standard deviation of the WAIS Full 
Scale IQ for our sample are 121.98 and 6.50, 
respectively, as compared with a mean of 
110.23 and a standard deviation of 25.25 on 
the WAIS standardization sample for the age 
range 20 to 34. It appears that the present 
sample overlaps the WAIS standardization 
sample principally in the superior range of 
intelligence. 


Table 1 

Analysis of Variance Results of the Effects of Scale and Practice and Product-Moment Correlations 

                           Comparison of scalesᵃ               Effects of practiceᵃ 
                      W-B I scores    WAIS scoresᵇ              First     Second              Product- 
                                                       F        admin.    admin.     F        moment 
Variable                M      SD       M      SD     ratio     meanᶜ     meanᶜ     ratio       rᵈ 
Information           12.67   1.55    13.27   1.35   14.66ᵉ     12.79     13.15     5.50ᵉ      .68 
Comprehension         13.77   1.65    15.06   2.08   21.96ᵉ     14.06     14.77     6.70ᵉ      .43 
Digit Span            11.38   2.93    12.79   3.11   30.50ᵉ     11.75     12.42     7.02ᵉ      .82 
Arithmetic            14.02   2.80    13.46   2.24    2.12      13.27     14.21     6.08ᵉ      .37 
Similarities          13.17   1.96    13.00   1.71    0.72      12.77     13.40     9.56ᵉ      .75 
Vocabulary            12.86   1.13    13.23   1.94    3.28      12.77     13.33     7.64ᵉ      .67 
Picture Arrangement   13.04   3.22    12.90   2.98    0.10      12.12     13.82    14.94ᵉ      .45 
Picture Completion    13.73   0.94    13.61   2.37    0.18      13.37     13.98     5.48ᵉ      .62 
Block Design          14.08   1.89    13.50   2.12    5.82ᵉ     13.38     14.19    11.38ᵉ      .65 
Object Assembly       14.00   1.88    14.06   2.62    0.02      13.23     14.83    19.34ᵉ      .31 
Digit Symbol          12.42   2.20    11.75   2.75    6.16ᵉ     11.35     12.83    29.78ᵉ      .70 
Verbal IQ            121.19   6.43   120.29   6.84    1.92     118.92    122.56    31.02ᵉ      .72 
Performance IQ       125.65   8.61   121.44   8.85   10.36ᵉ    119.31    127.79    42.00ᵉ      .25 
Full Scale IQ        125.38   6.43   121.98   6.50   16.40ᵉ    120.60    126.77    53.94ᵉ      .46 

ᵃ The scale and practice analyses of variance were evaluated using as the error term the variance of the residual subjects within cells, df 1 and 50. 
ᵇ Mean scores for the total sample with order of administration alternated. 
ᶜ Mean scores for W-B I and WAIS combined at each period of administration. 
ᵈ An r of .33 is significant at the [?] level of confidence when N = 52. 
ᵉ Denotes significance. An F ratio of 4.03 is significant at the .05 level of confidence and 7.17 at the .01 level in the table. 

Bartlett’s test for homogeneity of variance 
(3) among the 14 corresponding measures 
(i.e., the 11 subtests³ of WAIS and W-B I 
and their VIQ, PIQ, and FSIQ) yielded 
significant chi squares only for Voc (χ² = 18.36, 
p < .01) and PC (χ² = 37.53, p < .01). 
These results, which revealed significant het- 
erogeneity for Voc and PC, make the inter- 
pretation of the respective F ratios less cer- 
tain. For the other subtests the null hypothesis 
could not be rejected. Thereafter, three sepa- 
rate scale-, practice-, and scale-by-practice 


3 The following abbreviations are used for the sub- 
tests: Inf = Information; Comp = Comprehension; 
Arith = Arithmetic; Sim = Similarities; Dig Sp= 
Digit Span; Voc = Vocabulary; DS = Digit Symbol; 
PC = Picture Completion; BD = Block Design; PA 
= Picture Arrangement; and OA = Object Assembly. 
Also, the following were used: VIQ=Verbal IQ; 
PIQ = Performance IQ; and FSIQ = Full Scale IQ. 


analyses of variance (3) were completed for 
the WAIS and W-B I scores on each of the 
11 subtests and for the VIQ, PIQ and FSIQ 
scores. In all, 42 analyses of variance were 
done. The equivalence of subtest scores and 
IQ’s was further compared by means of Pear- 
son Coefficients of correlation, since it was be- 
lieved that the person-to-person subtest score 
equivalence could be low even though the 
group means might be similar. In order to 
avoid the contamination of practice effects, 
the total sample was not pooled; instead, 
correlations were computed between the 26 
persons who took each of the two scales first and 
then for the 26 persons who took each of the 
two scales last. Then, by means of Fisher's z 
transformation (3), the two correlations were 
averaged. 
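The averaging step can be sketched as follows. This is an illustration assuming a simple unweighted mean of the two z values, which is reasonable since both subsamples contain 26 persons; the function name is hypothetical. The two Arithmetic correlations used as input, .07 and .61, are the ones quoted in the Results for the two orders of administration.

```python
from math import atanh, tanh

def average_r(r_values):
    """Average correlations via Fisher's z: transform, take the mean, transform back."""
    z_mean = sum(atanh(r) for r in r_values) / len(r_values)
    return tanh(z_mean)

# Arithmetic: r = .07 when WAIS was given first, r = .61 when W-B I was given first.
print(round(average_r([0.07, 0.61]), 2))
```

This reproduces the .37 listed for Arithmetic in the last column of Table 1; the same procedure applied to the two PIQ correlations (-.12 and .56) gives .25.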


Results 


Table 1 presents a comparison of the means 
of the two scales, with order of administration 
controlled, and the effects of practice when 
either scale was administered first. Three of 
the verbal subtests (Inf, Comp, and Dig Sp) 
show significant mean differences, the WAIS 
scores being higher. Two of the performance 
subtests (BD and DS) also show significant 
mean differences in favor of the W-B I scores. 
Two of the W-B I IQ scores (PIQ and FSIQ) 
are significantly higher than the 
corresponding WAIS scores. These results appear to be, 
at least in part, a function of the different 
weighting systems used on WAIS and W-B I 
in the transformation of the raw scores to 
weighted scores. Specifically, a total weighted 
score of 65 on the W-B I verbal subtests, 
which is equivalent to a weighted score of 78 
on WAIS, yields a VIQ three points higher on 
W-B I than the VIQ on WAIS for the age 
range 20 to 24, and four to five points higher 
for the age range 25 to 34. It is of interest to 
observe in this connection that the age range 
represented by the present sample is very 
similar to that of the sample selected as the 
reference group upon which the scaled scores 
for the WAIS are based (6). The 
transformations to IQ units of the scaled scores earned 
on the verbal subtests appear to have resulted 
in greater comparability of the VIQ’s on the 
two scales; this F ratio is not significant, 
although significant differences were observed 
between these two scales on three of the verbal 
subtests. On the performance subtests, 
however, the transformations to IQ units of the 
scaled scores apparently did not operate to 
reduce the significant differences observed 
between the subtests of BD and DS, since the F 
ratio between the two PIQ’s is significant 
beyond the .01 level of confidence. 

Significant practice effects were observed in 
all of the 14 comparisons made without 
regard to scale. The verbal subtests were 
apparently less affected by practice than the 
performance subtests, as evidenced by an 
examination of the significance levels of the F 
ratios reported in Table 1 for the effects of 
practice, as well as by inspection of 
the increments attributable to practice for 
the individual subtests. The actual magnitude 
of the differences between the pre- and 
post-practice means should not be taken as an 
index of the amount of transfer to be 
expected on either scale, since these means 
represent an average of two nonequivalent 
scales. These findings are in keeping with a priori 
expectations that the verbal subtests would 
show relatively fewer practice effects than the 
performance subtests. 

Intercorrelations between W-B I and WAIS 
for the present study are listed in the last 
column of Table 1. Although most of the 
group means for W-B I and WAIS are quite 
comparable, it is observed that the 
person-to-person correspondence for some of the 
subtests was low, notably OA, Arith, Comp, and 
PA. The low correlation between the two 
PIQ’s of only .25 is also notable. These re- 
sults suggest that one could not predict with 
much accuracy one subtest score from the 
other, on these particular subtests, for our 
sample. 

The scale-by-practice interaction is not 
reported, since the Inf subtest had the only 
significant F ratio here (F = 6.60, p < .05). 
This interaction was evaluated by using the 
variance of the residual subjects between cells 
as the error term, with degrees of freedom 
being 1 and 50. This is interpreted to indi- 
cate that, for the most part, practice effects 
do not appear to depend on which of the two 
forms is administered first. 

Table 2 presents the means and standard 
deviations for WAIS and W-B I when each 
of these tests is administered first, as well as 
the Pearson product-moment correlation co- 
efficients for the 26 patients who were ad- 
ministered the WAIS initially, and the 26 
who were administered W-B I first. Here 
again, although the correlations between the 
two forms are quite low in some instances 
(notably an r of only .07 for Arith, .26 for 
OA, — .12 for PIQ, and .15 for FSIQ) when 
WAIS is given first, much similarity among 
the means is observed. Low correlations are 
also in evidence when W-B I is administered 
first, although again the means are quite simi- 
lar. The correlation between the two Comp 
subtests is only .33, that between the two OA 
subtests only .37, between the two subtests of 
PA .40, and between the two Voc subtests .45. 

In comparing the two sets of correlations 
in Table 2, it is difficult to understand the 
fluctuations in the observed correlations 
associated with scale, i.e., which of the two tests 
is given initially. With WAIS administered 
last in contrast to when W-B I was taken 
last, the correlation between the two Arith 
subtests increased from .07 to .61; that 
between the two PIQ’s from -.12 to .56; and 
that for the FSIQ’s from .15 to .69. 
Moreover, when WAIS was given first, the three 
highest subtest correlations (notably Dig Sp, 
Sim, and Voc) showed a marked reduction in 
size when the order of administration was 
reversed. No obvious explanation of these effects 
is apparent, although they may be only a 
function of the small sample size used in this 
study. 


Table 2 

Comparison of WAIS and W-B I Scales for Two Subsamples (N = 26) 

                           WAIS admin. first                        W-B I admin. first 
                      WAIS scores    W-B I scores   Product-    W-B I scores    WAIS scores    Product- 
                                                    moment                                     moment 
Variable                M      SD      M      SD      rᵃ          M      SD       M      SD      rᵃ 
Information           13.54   1.45   13.31   1.35     .58       12.04   1.48    13.00   1.18     .75 
Comprehension         14.65   2.45   14.08   1.66     .50       13.46   1.58    15.46   1.53     .33 
Digit Span            12.46   3.09   11.73   2.98     .90       11.04   2.84    13.11   3.09     .69 
Arithmetic            13.35   2.02   14.85   2.27     .07       13.19   3.03    13.58   2.44     .61 
Similarities          12.58   1.60   13.38   1.67     .86       12.96   2.19    13.42   1.71     .58 
Vocabulary            12.69   1.81   12.88   1.37     .81       12.85   0.82    13.77   1.91     .45 
Picture Arrangement   11.96   2.62   13.81   3.24     [?]       12.27   3.02    13.85   3.01     .40 
Picture Completion    12.92   2.35   13.65   0.78     .51       13.81   1.08    14.31   2.18     .70 
Block Design          13.50   2.22   14.88   1.58     .51       13.27   1.83    13.50   2.01     .75 
Object Assembly       13.31   2.78   14.85   1.79     .26       13.15   1.56    14.81   2.20     .37 
Digit Symbol          10.92   2.74   13.08   2.39     .67       11.77   1.76    12.58   2.48     .73 
Verbal IQ            118.73   6.11  123.27   5.38     .67      119.11   6.72   121.85   7.17     .76 
Performance IQ       117.08   7.33  129.77   7.66    -.12      121.54   7.46   125.81   8.04     .56 
Full Scale IQ        119.08   5.43  128.65   5.19     .15      122.11   5.89   124.92   6.16     .69 

ᵃ For significance at the [?] levels of confidence an r must 


Discussion 


These results are similar to those found by 
Barry et al. (1) who compared W-B I and 
W-B II on a sample of military flyers. Those 
authors reported significant practice effects on 
VIQ, PIQ, and FSIQ, as well as on Dig Sp, 
Arith, BD, and DS. The present study found 
significant practice effects on all three of the 
IQ's, as well as on all of the verbal and performance
subtests. Barry et al. also reported
significant scale effects on Sim, PC, OA, and 
DS, while none of these particular subtests, 
except for DS, was found to be significant in 
this study. It may be that the DS subtests on
the Wechsler scales were more prone to the 
effects of both practice and scale than any of 
the other subtests. 

It is of interest also to compare the present 
findings with those of Cole and Weleba (2) 
who studied 46 college students on WAIS and
W-B I. They reported significant practice ef- 
fects for all three IQ scores upon administra- 
tion of the second test, and they observed 
that the greatest practice effects were found 
on the performance subtests. They obtained 
the following correlations between IQ scores 
for the two tests: verbal .87, performance .12, 
and full scale .52. Our corresponding correla- 
tions are verbal .72, performance .25, and full 
scale .46. They also reported correlations be- 
tween VIQ and PIQ of .33 on WAIS, and of 
-14 on W-B I. In our study with WAIS ad- 
ministered first, the Pearson r between WAIS |, 
VIQ and PIQ was .19, while the correspond- 
ing correlation was .14. With W-B I given 
first, the correlation between VIQ and PIQ oD | 
WAIS was .09 and the corresponding W-B I 
Correlation was .17. With regard to effects of 
Practice and the correlations of the verbal ana | 
performance IQ scores between and within the i 
two scales, it is readily apparent that there 

is much agreement between their results 2” 
ours. | 


Summary and Conclusions 


This study compared the WAIS and W-B I
scales with respect to score equivalence and
effects of practice. The results with regard to
score equivalence were as follows. An analysis
of variance revealed significant differences be- 
tween the two scales on Information, Compre- 
hension, Digit Span, Block Design and Digit 
Symbol, as well as on the Performance and 
Full Scale IQ's. The three verbal subtests
were significantly higher on WAIS than on
the corresponding W-B I subtests, while the 
two performance subtests and two IQ meas- 
ures were significantly higher on W-B I. 

On the whole, subjects tended to retain the 
same relative rank on both tests, although this
was not true for Arithmetic, Object Assembly,
Performance IQ or Full Scale IQ when WAIS
was administered first, or for Comprehension
and Object Assembly when W-B I was given
initially. The average correlations between the
two tests are not regarded as high enough to
warrant their being used interchangeably because
of the relatively low correlations between
several of the subtests, notably Arithmetic,
Object Assembly, Comprehension, and
Picture Arrangement, as well as between the 
Performance IQ’s and the Full Scale IQ’s. 

Practice effects were also found regardless 
of which scale was administered first on all of 
the verbal and performance subtests, as well 
as on the Verbal IQ, Performance IQ, and Full 
Scale IQ. Scale-by-practice interaction re- 
vealed significant variation only for Informa- 
tion. Apparently, the Information subtest on 
WAIS yields significantly higher test scores 
than its corresponding subtest on W-B I 
independent of order of administration and 
practice effects. In view of the findings of 
this study it is concluded that W-B I is not
a satisfactory alternate for the WAIS and
that a need for an alternate form of the WAIS
is indicated. 

Although the results reported here are at- 
tenuated in comparison with the WAIS stand- 
ardization sample, our findings reveal a num- 
ber of important differences between the two 
scales which might be critical in clinical and 
research work. As with all new tests, a con- 
siderable body of statistical information bear- 
ing on validity with practical criteria must be 
accomplished before full acceptance can be 
granted for the WAIS. In the meantime, the 
improved standardization of the WAIS is a 
strong argument in favor of its adoption by 
clinical psychologists. 


Received September 5, 1956.


A Comparison of Two Methods of Estimating
Full Scale IQ From an Abbreviated WAIS 


Philip Himelstein ¹


Air Force Personnel and Training Research Center 


Recent studies have shown that there is 
a very dependable relationship between Full 
Scale WAIS IQ and the Doppelt regression 
equation (1) for both normal and psychiatric 
groups. In using the Doppelt equation, the 
scaled scores of the Arithmetic, Vocabulary, 
Block Design, and Picture Arrangement sub- 
tests are added, and this sum is multiplied by 
2.5. A constant based on the subject’s age is 
added to the product to determine the esti- 
mated Full-Scale Score. The purpose of this 
study is to compare the efficiency of this re- 
gression equation with the more usual clinical 
procedure of prorating the scores of the same 
four subtests. 

The subjects consisted of 61 hospitalized 
psychiatric patients who were tested on the 
admission service of a psychiatric hospital 
with a full WAIS. Of the 61 patients, 29 were 
diagnosed schizophrenic reaction and 12 were 
brain damaged. The WAIS was scored three
times: by the usual calculation procedure described
in the manual (2), with the Doppelt
equation, and with the Verbal and Performance
scores prorated separately and then summed
to estimate Full Scale Score.
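
The two abbreviated scoring procedures can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the age constant is passed in as a parameter because its values are not given here, and the proration assumes that the full WAIS contains six Verbal and five Performance subtests. Both functions return an estimated Full Scale Score (a sum of scaled scores), which would still have to be converted to an IQ by means of the tables in the manual.

```python
def doppelt_estimate(arithmetic, vocabulary, block_design,
                     picture_arrangement, age_constant):
    """Estimated Full Scale Score: 2.5 times the four-subtest sum,
    plus a constant that depends on the subject's age (supplied by caller)."""
    subtest_sum = arithmetic + vocabulary + block_design + picture_arrangement
    return 2.5 * subtest_sum + age_constant


def prorated_estimate(arithmetic, vocabulary, block_design,
                      picture_arrangement, n_verbal=6, n_performance=5):
    """Prorate the Verbal and Performance halves separately, then sum.
    n_verbal and n_performance are the assumed subtest counts of the full scale."""
    verbal = (arithmetic + vocabulary) * n_verbal / 2
    performance = (block_design + picture_arrangement) * n_performance / 2
    return verbal + performance


# Hypothetical scaled scores and a hypothetical age constant.
scores = dict(arithmetic=8, vocabulary=9, block_design=7, picture_arrangement=6)
print(doppelt_estimate(**scores, age_constant=10))
print(prorated_estimate(**scores))
```

The regression method weights all four subtests equally, whereas proration weights the two Verbal subtests slightly more heavily than the two Performance subtests, so the two estimates need not coincide for an individual subject.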

The mean Full Scale IQ of the present
group was 87.3. The means of both the regression
and proration methods of scoring
were an identical 88.5. The average deviation
from Full Scale IQ obtained with the regression
equation was 4.0, and for the prorated
scores, it was 4.5.


1 An extended report of this study may be obtained
without charge from Philip Himelstein, Personnel
Laboratory, Air Force Personnel and Training
Research Center, Box 1557, Lackland AFB, Texas,
or for a fee from the American Documentation Institute.
Order Document No. 5102, remitting $1.25
for microfilm or $1.25 for photocopies.

When the Full Scale IQ (including the 
four subtests) was correlated with IQs ob- 
tained with the regression equation, a coeffi- 
cient of .954 was obtained. Full-Scale and 
prorated IQs correlated .953. The two meth- 
ods of estimating Full Scale IQ correlated 
.988 with each other.
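
The two summary statistics used above, the average deviation of an estimate from the obtained Full Scale IQ and the product-moment correlation between them, might be computed as in the following sketch. The IQ values shown are invented purely for illustration and are not data from this study; the function names are likewise hypothetical.

```python
from math import sqrt


def average_deviation(estimated, obtained):
    """Mean absolute difference between estimated and obtained IQs."""
    return sum(abs(e - o) for e, o in zip(estimated, obtained)) / len(obtained)


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented IQs for five hypothetical patients.
full_scale = [82, 95, 74, 101, 88]
regression = [85, 93, 78, 104, 86]
prorated = [86, 92, 79, 105, 85]

for label, estimate in (("regression", regression), ("prorated", prorated)):
    print(label,
          round(average_deviation(estimate, full_scale), 2),
          round(pearson_r(estimate, full_scale), 3))
```

A high correlation does not by itself guarantee small individual errors, which is why the average deviation is reported alongside it.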

This study indicates that, for the four 
WAIS subtests investigated, both the regres- 
sion equation and the proration method of 
estimating Full-Scale WAIS scores are about 
equal in effectiveness. The clinician should 
feel free to use the method with which he 
is most comfortable. It would appear that 
Wechsler (2, p. 31) was overly cautious in 
warning against the use of less than five Ver- 
bal subtests and four Performance subtests in 
a prorating procedure. 


Brief Report. 
Received January 8, 1957.


The Validity of the Barron Ego-Strength Scale
and the Welsh Anxiety Index ¹


Ronald Taft


University of Western Australia


This paper represents an attempt to confirm
the validity of two MMPI scales that have
been proposed for distinguishing normals from
persons with pathological symptoms. The
scales are the Barron Ego-strength scale (1)
and the Welsh Anxiety Index (5), both of
which have been shown by their authors to
distinguish normal subjects from several
groups of psychiatric patients (see also 3).
The question being asked here is whether
these indices generalize so widely that they
hold up in a different culture to that of the
U.S.A.; namely, in Australia. It has already
been demonstrated (4) that, with a normal
group, the only possibly important differences
on the MMPI scales are higher Australian
scores on Mf and Pd for males and Mf, D
and Sc for females.


Subjects

Normals

The group taken as representative of the
normals consisted of 40 male and 10 female
first-year psychology students. These subjects
were administered the MMPI test together
with the rest of their class, as part of their
regular course work. In order to facilitate
matching with the clinical sample, only students
aged over 20 years were used and the
50 subjects were chosen at random from this
pool. Some results, however, will also be
quoted from the other students.

Patients

The psychiatric patients were drawn from
a pool of outpatients and inpatients at a repatriation
(veterans) hospital. All patients attending
the psychiatric ward of the hospital
were given the MMPI by routine. The out-
and inpatients were about equal in number.
Approximately two-thirds of the patients were
diagnosed as psychoneurotic and psychopathic
personality and the remainder as psychosomatic
or psychotic cases. The sample used
in this present study was selected to match
the student sample on sex, age, and educational
level as far as possible. The matching
data are reproduced in Table 1.


1 The author thanks Miss Ronnie Jennings for
assistance with the computational work for this report,
and Mr. J. R. E. White for making available the
psychiatric records.


Table 1
Characteristics of Australian Sample

                 Number              Age            Minimum
Group         Male   Female     Mean    Range      education
Students       40      10        32     21-54      12 years
Patients       40      10        35     20-54      10 years


Results 


For comparative purposes I have averaged
all the relevant means and standard deviations
published in the articles by Barron (1)
and Welsh (5). In the case of the Ego-strength
scale (Table 2) the U. S. mean is
taken as the mean of the six studies quoted
by Barron that refer to psychiatric samples
(363 cases). The standard deviation represents
the median of the SD in these studies.
At least one of Barron's samples included
both men and women, and the sex distribution
in the other clinical samples is unknown.
A study by Quay (3) using student nurses as
the normal group and female inpatients as
the clinical group is also quoted, as our results
suggest that females score lower than
males on the Ego-strength scale.

The comparative results on Ego-strength
(Es) are reported in Table 2.

A similar comparison is made in Table 3
between U. S. and Australian results for the
Anxiety Index (AI). The U. S. figures were
derived by obtaining the mean of all of the
relevant samples quoted in (2) and (5). Psychoneurotic
cases described as "mild" have
been omitted.


Table 2
Comparative Scores on the Barron Ego-Strength Scale

                                                       Male           Female          Mixed
Sample                                               M     SD       M     SD       M     SD
U. S. A. Normals
  1. Barron: Graduate students (N = 40)            50.9    5.6
  2. Barron: Air Force Officers (N = 60)           52.7    4.0
  3. Quay: Student nurses (N = 92)                                 45.1    5.8
Australian Normals
  4. Taft: 1st year Psychology (matched group)                                    47.4    4.7
     (N = 50)
  5. Taft: 1st year Psychology (unmatched group)   48.8    4.0     42.2    5.0    46.9    4.3
     (N = 50 males, 20 females)
U. S. A. Psychiatric Patients
  6. Barron (6 samples) (N = 363)                                                 41.5    7.9
  7. Quay (N = 74)                                                 37.5    7.2
Australian Patients
  8. Taft: matched group                           38.6    9.7     33.2   10.0    37.5    9.8
     (N = 40 males, 10 females)


Table 3
Comparative Scores on the Welsh Anxiety Index

Sample                                          N        M
U. S. Normals
  1. Goodstein: Male students                 >550      [?]
  2. Black: Female students (a)               >5000     [?]
  3. Welsh: Mean of 13 normal samples         <1700     48.6 (SD of one sample, 16.0)
Australian Normals
  4. Taft: Matched sample                       50      [?]
  5. Taft: Unmatched sample of students
       Male                                     65      54.6
       Female                                   67      51.9
       <25 yrs                                  85      55.7
       >25 yrs                                  47      48.6
       Total                                   132      53.2
U. S. Psychiatric Patients
  6. Welsh: Mean of 24 samples                >300      79 (SD, 27-31)
Australian Patients
  7. Taft: Matched sample                       50      78.6 (SD, 31.0)

a Computed from figures quoted in (2) representing the combined means of varied student samples.


The following summarizes the findings: 


1. The Es scale distinguishes the Australian
normals (sample 4) from the matched sample
of psychiatric patients (t is significant at
the .01 level). Taking 43 as the breaking
point score, 78 per cent of the normals exceeded
this compared with 24 per cent of the
patients.

2. The AI distinguishes the Australian normals
(sample 4) from the matched sample of
psychiatric patients (t is significant at the
.01 level). Taking 66 as the breaking point
score, 18 per cent of the normals exceeded this
compared with 62 per cent of the patients.

3. The scores for Australian male freshmen
students on the Es scale were 2.1 points below
the U. S. graduate students, significant
at the .05 level.

4. The scores for Australian female freshmen
students on the Es scale were 2.9 points
below the U. S. student nurses, significant at
the .05 level.

5. The scores for the Australian psychiatric
patients on the Es scale were 4.0 points below
the U. S. patients, significant at the .01
level.

6. The mean AI scores for the Australian
male and female students (sample 5) did not
differ from the corresponding U. S. subjects
(samples 1 and 2).

7. The mean AI for the Australian patients
was virtually the same as that for the U. S.
patients. For the three samples quoted (N =
223), 76 per cent of Welsh's patients exceeded
a score of 60, compared with 64 per cent of
the Australian patients. This difference is not
significant.

8. The Australian male students obtained
higher Es scores than the female, and Barron's
male graduate students obtained higher
scores than Quay's female nurses. Both of
these differences are significant at the [?]
level.

9. The Australian male students obtained
higher AI scores than the female, but this difference
is not significant. In the U. S. samples
of students, the males also obtained substantially
higher AI scores (55.1 versus 30.0).

10. On the Australian student sample (sample 5),
the younger subjects (under 25) obtained
a higher AI than the older, significant
at the .05 level. This is consistent with the
finding of Wauck (5, pp. 69-70) that the AI
of schizophrenic patients decreases with age.


Summary and Conclusions 


The validation of the Barron Ego-strength 
scale and the Welsh Anxiety Index was 
checked by comparing a sample of 50 Aus- 
tralian students with a matched sample of 50 
patients in a psychiatric clinic. The following 
represent the overall conclusions: 


1. The validity of the Es scale is confirmed 
in that it successfully distinguishes the nor- 
mal from the clinical subjects. Its validity 
thus generalizes beyond the American culture. 

2. The Es norms for Australian students and
patients are consistently below the American
norms. Since this applies to patients as well as
both male and female students, this suggests 
that the significance of some of the items 
varies with the culture, rather than that the 
differences in the Es scores reflect real per- 
sonality differences between Australians and 
Americans. The latter is possible, however. 
(See the discussion on this point in 4.) 

3. Males score higher than females on Es. 

4. The validity of the AI is confirmed in
that it successfully distinguishes the normal
from the clinical sample. Its validity thus
generalizes beyond the American culture.

5. The norms for the Australian students
and patients do not differ from the American
norms for the AI.

6. Male students obtain higher AI scores
than female.

7. The AI decreases with age.


Subtle and Obvious Test Items and Response Set


Benno G.


University of Michigan


Recent research (6, 7, 9, 10, 11, 12) with
the subtle and obvious scales for the Minnesota
Multiphasic Personality Inventory
(MMPI) raises some important interpretative
problems. While it is not puzzling to find that
hospitalized psychiatric patients and other
maladjusted and unsuccessful groups obtain
high T scores on the obvious scales, it is disconcerting
to find that groups of recovered
psychiatric patients, successful trainees, successful
salesmen, and college sophomores obtain
higher T scores on the subtle scales than
the "normal" MMPI population, and higher
T scores than groups of unrecovered psychiatric
patients, unsuccessful trainees, etc. Since
each of the items in the subtle scales was
originally selected by Hathaway and McKinley
because it discriminated between normal
and abnormal groups, it appears that
there is something common either to the items
or to the groups which influences the size of
T scores more than does the original discriminating
power of the items.

It is quite possible that groups obtaining
high T scores on the subtle scales have a response
set to answer "false" to the items in
the MMPI and consequently are "caught" on
a disproportionate number of items which are
scored false (2). That this explanation is
plausible can be seen from Table 1. It shows
that Wiener's division of the MMPI items
(10, 12) for five clinical scales into subtle (S)
and obvious (O) items tends to make "false"
the scored response for the S items. Four of
the five S scales have a large majority of their
items scored false; only the S-Ma scale has
more true than false responses. It is of interest
that it is the S-Ma scale which does not
separate the groups studied by Wiener (10,
11). As well as the number and percentage of
scored responses, Table 1 gives the standard
score a person would have if all items are
marked true or if all items are marked false.
The magnitude of the differences in percentages
and standard scores should be noted.
The small number of items in each of the S
and O scales and possible inaccuracies in classifying
the items as S and O ¹ should be considered
in evaluating the response set explanation.


Table 1
Scored Responses for Five MMPI Scales Containing Subtle and Obvious Items

                                                  MMPI scale
                           D               Hy               Pd               Pa               Ma
Scored response        T    S    O      T    S    O      T    S    O      T    S    O      T    S    O
True                  [?]  [?]  [?]    [?]  [?]  [?]    [?]  [?]  [?]    [?]  [?]  [?]    [?]  [?]  [?]
False                  40   17   23    [?]  [?]   20    [?]  [?]  [?]    [?]   12    3    [?]  [?]  [?]
Total                 [?]  [?]  [?]    [?]  [?]  [?]    [?]  [?]  [?]    [?]  [?]  [?]    [?]  [?]  [?]
Percentage true        33   15   43     22    4   38     48   18   71     63   29   87     76   65  [?]
Percentage false       67   85   57     78   96   62     52   82   29     37   71   13     24   35  [?]
Percentage dif.       -34  -70  -14    -56  -92  -24     -4  -64   42     26  -42   74     52   30  [?]
T score when all
  marked true          58   21   83     44   26   71     75   33   94    100   47  109     97   74  106
T score when all
  marked false        106   77  101    106   83   92     81   86   58    [?]  [?]  [?]     43   59   26
T score difference    -48  -56  -18    -62  -57  -21     -6  -53   36    [?]  [?]  [?]     54   15   80


Table 2
Correlations Between K and Subtle and Obvious MMPI Scales
(N = 144)

                    MMPI scale
Item type       D     Hy    Pd    Pa    Ma
Subtle         .67   .77   .38   [?]   .02
Obvious        -.48, -.60, -.50, -.65 (one of the five values is illegible; column positions uncertain)


It has been previously suggested (2) that
the K scale (having 29 of its 30 items scored
false) was essentially a measure of the response
set to answer false to personality test
items. If K is a measure of this response set,
then according to the true-false imbalance in
the S and O scales in Table 1, it would be expected
that the S scales, with the exception of
S-Ma, would correlate positively with K, and
the O scales would correlate negatively with
K. The correlation coefficients (r) for a sample
of 144 out-patient psychiatric patients are
shown in Table 2.² Nine of the ten expectations
are fulfilled; only the S-Ma scale has
misbehaved, but only by a small amount. It
is probable that K or some other response-set
scale ³ could be used as a suppressor variable
to improve the validity of the S and O scales
(i.e., scores from a scale such as K would
have to be subtracted from the scores of four


1 Subtlety-obviousness for the five scales was determined
rationally and not empirically, by Wiener;
subtle and obvious scales were not formed for the
other clinical scales because Wiener felt that they
contained too few subtle items.

2 The writer is grateful to Drs. Sherman E. Nelson
and Daniel N. Wiener for supplying the scores from
which the r's were computed. Dr. Nelson indicated
in a personal communication that the sample consisted
of serial cases who came for treatment to the
Veterans Administration's Mental Hygiene Clinic in
St. Paul, Minnesota.

3 A scale which may be useful for this purpose is
described in "A Response Bias (B) Scale for the
MMPI." J. counsel. Psychol., 1957, 4, in press.
Table 3 


Correlations Between K and Uncorrected MMPI Clinical Scales 


MMPI scale 


s i Sex N D Hy Pd Pa Ma 
ample 
Normals 
Bid j M 100 o8 53  —09 2 8642 
raduates ( M 112 —07 50 —09 19 —30 
College (8) M 179 —20 21 —26 —12 —37 
College (1) M 100 15 48 ii} —07 —36 
Nimal 2 F 100 —03 30 —06 —02 —28 
orma! 
—03 48 —09 —02 —36 
Median correlation coefficient 
Abnormals —19 14 =38 ~ 2 —45 
NP patients (8) x uo —04 15 —16 —15 —10 
WE S pee M 100 7 10 -3i = =i 
patients (13) M 144 —2 5 as 37 
a patients M 100 = T E E e 
ERE a l i -9 o -3 -12 —4 
normals F 63 3 
H 7 
ysterics (3) = i3 -R 17 ET 


Median correlation coefficient


of the S scales, and would have to be added
to all of the O scales). 

On the basis of some data and theoretical 
considerations, Wiener (10) suggested that 
the S items are best for assessing the person- 
ality of normal persons and that the O items 
are best for abnormal persons. If it is true 
that S items are more likely to function in a 
normal population and O items in an abnor- 
mal population, then according to the present 
contention that K operates as a measure of 
response set, a difference in r's should be reflected
in normal and abnormal groups. Specifically
it would be expected that the r's of
K with the full scales for D, Hy, Pd, Pa, and
Ma would be more positive, or less negative,
in a normal than in an abnormal population.
The r's presented in Table 3 reveal that four
of the five expectations are fulfilled; only
the r's between K and Ma are not substantially
different. The data summarized here uphold
Wiener's contention that different items
function in normal and abnormal groups. A 
tentative hypothesis drawn from the data is 
that the more positive the correlation between 
scores from a measure of response set to an- 
swer false and scores from clinical scales (com- 
posed of subtle and obvious items), the more 
likely it is that the group is well adjusted or 
successful. More research, preferably with an 
instrument other than the MMPI, is needed 
to further test the hypothesis. 


Summary 


A response-set explanation was offered to 
account for the repeated finding that rela- 
tively well-adjusted and successful persons 
obtain more abnormal scores on the subtle 
scales of the MMPI than maladjusted and 
unsuccessful persons. Evidence was assembled
through an analysis of the subtle and
obvious scales which supports the previously
formulated (2) response-set interpretation of
the K scale.


Testing for Stimulus Equivalence among Authority
Figures by Similarity in Trait Description ¹


Donald T. Campbell


Northwestern University


and Jean P. Chapman


University of Illinois, College of Medicine


A most common feature of personality the- 
ory and projective testing is the assumption 
of some degree of stimulus equivalence be- 
tween father and other persons in authority 
roles. That such should be the case would be 
predicted from most theories of learning and 
perception. The truth of the assumption seems 
to be confirmed daily in clinical practice. Few 
topics could bring more unity of prediction 
from competing psychological theories. And 
yet an effort to test the hypothesis in a quan- 
titative fashion utilizing a series of measures 
of attitude toward father, boss, and fellow 
worker failed utterly (1). The present study 
represents another effort toward the same end. 

One person's characterization of another
contains both systematic variance correlated
with the person being described and systematic
variance correlated with the person doing
the describing. The first we regard as accurate
knowledge, the second is often characterized
as projected content, reflecting the personality
of the judge or the residues of prior experience
in which the person being judged was not involved.
In conditioning language, the response
to the stimulus of another person represents
in part habits learned through the reinforcement
of prior responses to that specific
person (the valid component), and in part
response tendencies coming from learning
situations involving other persons, transferred
to the present person on the basis of stimulus
generalization (the projected component). In
the terms of one or another cognition theory,
the percept or concept of the other person
represents in part a veridical component
(greater the more structured the situation or
the more reality testing that has taken place)
and in part meaning introduced through assimilation
of the percept to preexisting schemata,
sets, hypotheses, attitudes, archetypes
or the like. For either point of view, or for
earlier psychoanalytic notions, the projected
component would tend to make some of the
descriptions which one person makes of several
others more similar than the "real facts"
of their personalities would justify. Such similarity
in trait descriptions is employed in this
study to infer generalized response tendencies
or attitudes.


1 This study was supported in part by the United
States Air Force under contract No. AF 18(600)-170
monitored by the Crew Research Laboratory, Air
Force Personnel and Training Research Center, Randolph
Air Force Base, Randolph Field, Texas. Permission
is granted for reproduction, translation, publication,
use and disposal in whole and in part by or
for the United States Government.

Let us consider a person’s description of 
his boss, of a subordinate whom he super- 
vises, of his father, and of his younger 
brother. “Projected” or “generalized” vari- 
ance might be expected to spuriously increase 
the similarity of the descriptions in terms of 
two pairs of attitudes: The similarity of de- 
scription between father and boss should be 
enhanced by any generalized attitude toward 
authority figures, or any generalization to cur- 
rent authorities of responses learned to the 
father in childhood. And while the status of 
siblings is not as clearly hierarchical as is the 
employment situation, one might expect that, 
through similarity of role, the subordinate
would elicit responses originally learned to the
younger brother, generating some unwarranted 
similarity between the descriptions of the two. 
In orthogonal fashion, boss and subordinate 
both come from the current work-a-day world, 
and might be jointly influenced by general 
attitudes toward the work situation. Like- 
wise, father and younger brother descriptions 
should both be affected by any generalized 
attitudes toward the childhood home and 
family. In any single set of descriptions the 
tendencies created by such attitudes would 
be overshadowed by actual differences in the 
personalities involved, and by other specific 
sources of variance—but in a series of com- 
parisons, these slight tendencies should show 
some overall effect. 


Method 


As a part of a larger study attempting to 
measure generalized attitudes towards su- 
periors and subordinates, men of various Air 
Force samples filled out a questionnaire en- 
titled “Description of Self and Others.” The 
instrument provided a list of thirty person- 
ality traits. The respondent was asked to de- 
scribe a number of persons by use of these 
traits, picking the 8 best fitting traits and 
the 8 terms most definitely not appropriate, 
leaving the 14 intermediate terms unmarked. 
Where appropriate to his situation, the respondent
described, among others, his father,
his next younger brother, his immediate supervisor
(or "boss") and a subordinate, a person
whom he supervised. It was judged maximally
efficient to limit the initial analysis to
respondents who described all four of this
symmetrical set of persons. From 91 preflight
cadets tested at Lackland Air Force Base in
January 1953, 15 such persons were obtained.
From 70 officers of B-29 Bomber crews tested
at Randolph Air Force Base, 17 met the requirements.
From 77 enlisted bomber crew
members, 17 cases were obtained.

For each case an index was computed to 
express the similarity between pairs of 
descriptions. For this purpose, a Q-type (3, 4) 
correlation coefficient was used, with an N of 
30 traits scored on a three-step scale, “doesn’t 
apply,” “omitted,” and “applies.” The forced 
distribution of trait assignments provides 
equated means and deviations for each 
description, simplifying the computations. For 
each respondent, six such coefficients were 
computed, expressing the similarity of 
description for all six pairings of four persons 
described. The relative size of these 
coefficients provides the argument of this paper. 
While in general the descriptions were 
favorable and only 9 per cent of the coefficients 
negative, sufficient range was obtained to 
make the comparisons seem appropriate. 

Fig. 1. Analytic dimensions and mean similarity of 
descriptions for the basic 49 cases. 

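
The similarity index described above can be sketched in a few lines. This is only an illustration under stated assumptions: the three steps are coded -1, 0, and +1 (the paper does not give the numerical coding, and any equally spaced coding yields the same coefficient), and the function names are hypothetical. Because the forced 8-14-8 distribution fixes the mean and sum of squares of every description, the product-moment r reduces to a cross-product sum divided by 16.

```python
from itertools import combinations

N_TRAITS = 30   # traits on the list
N_PICKED = 8    # traits marked "applies" and, separately, "doesn't apply"


def q_correlation(desc_a, desc_b):
    """Product-moment r between two descriptions coded -1, 0, +1 (assumed coding)."""
    for d in (desc_a, desc_b):
        if len(d) != N_TRAITS or d.count(1) != N_PICKED or d.count(-1) != N_PICKED:
            raise ValueError("description violates the forced 8-14-8 distribution")
    # With the forced distribution each description has mean 0 and
    # sum of squares 16, so r is the cross-product sum over 16.
    cross = sum(a * b for a, b in zip(desc_a, desc_b))
    return cross / (2 * N_PICKED)


def all_pairs(descriptions):
    """Return the six coefficients for four described persons (name -> scores)."""
    return {
        (p, q): q_correlation(descriptions[p], descriptions[q])
        for p, q in combinations(sorted(descriptions), 2)
    }
```

Two descriptions that share half of their "applies" choices and all of their "doesn't apply" choices, for instance, give a coefficient of .75 under this coding.
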
The comparisons to be made are indicated 
in the four-cornered, two-dimensional schema 
shown in Fig. 1. Pairs of persons in the same 
column, or in the same row, have one of the 
four hypothesized general attitudes in com- 
mon, and consequently should show a higher 
degree of correlation than is found for pairs 
diagonal from each other. This is not to say 
that the correlation between diagonal pairs 
represents only “true” similarity, for there 
are undoubtedly still more general attitudes 
or response sets which make all four of the 
descriptions more similar than they should be. 
A general attitude toward other men, or 
still more general philanthropy-misanthropy 
attitude, or predilections for certain 
terms, all would lead to unwarranted 
similarity among descriptions. But the row and 
column pairs should have the additional 
similarity coming from the more specific attitude 
under study, and this greater similarity is the 
object of the present search. Figure 1 
indicates the mean correlation between each of 
the possible pairs of the four descriptions for 
the 49 subjects (Ss) (computed by the z′ 
transformation). As a means of evaluating the 
significance of the differences between the 
various average values, sign tests, based upon 
the frequency one value exceeded the other 
for individual Ss, have been made. These 
comparisons are summarized in Table 1, and will 
be explained in more detail below. 
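
A mean correlation of the kind plotted in Fig. 1 can be illustrated as follows, assuming that the transformation referred to is Fisher's r-to-z transformation (the inverse hyperbolic tangent). The function name is hypothetical, and the sketch is not a reconstruction of the authors' actual computations.

```python
import math


def mean_correlation(rs):
    """Average correlations through Fisher's z and convert the mean back to r."""
    # atanh is undefined at r = +1 or r = -1, so such values raise ValueError.
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))
```

Averaging in z rather than in r gives more weight to high coefficients: the mean of .00 and .80 comes out at .50 rather than .40.
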


Results 


Generalized attitude toward authority. In 
terms of the analysis presented above, if there 
are generalized attitudes toward persons in 
authority, then descriptions of Father should 
correlate more highly with descriptions of 
Boss than with descriptions of Subordinate. 
The trends in the mean rs and in the tallies 
of Table 1 are in the expected direction, 
reaching the .02 level of significance. By the 
same reasoning, Boss should correlate more 
highly with Father than with Younger 
Brother. Again the findings are positive, at 
the .03 level. Although the results are 
confirmatory, the magnitude of the correlation 
difference and the level of significance are 
probably less than one would have expected 
from such a ubiquitous hypothesis. 

Generalized attitude toward subordinates. 
If there are generalized attitudes toward sub- 
ordinates, then Younger Brother should cor- 
relate more highly with Subordinate than with 
Boss. While mean values show a slight 
supporting trend, the tally of individual trends 
is in the opposite direction. Similarly, 
Subordinate should correlate more highly with 
Younger Brother than with Father. No sig- 
nificant tendency in this direction is found. 

Generalized attitude toward family. Attitude 
toward family should appear in Father 
correlating more highly with Younger Brother 
than with Subordinate. Similarly, Younger 
Brother should correlate more highly with 
Father than with Boss. Chance levels are ob- 
tained for both predictions. 

Generalized attitude toward current asso- 
ciates. Over-all attitudes toward the current 
work situation such as would influence the 
evaluation of persons both above and below 
in the table of organization would lead to the 
prediction that Boss would correlate with 
Subordinate more highly than Boss would 
correlate with Younger Brother. And 
Subordinate should correlate higher with Boss 
than with Father. While the means in Fig. 1 
support the hypothesis, it receives chance 
confirmation or less in the individual tallies of 
Table 1. 


Table 1 
Proportions of Comparisons of Coefficients in Expected Direction 

                                                         Bomber    Bomber
                                                         officers  enlisted
Variables compared                             Cadets              men       Total
                                               (15)      (17)      (17)      (49)      p

Authority Attitude
  Father-Boss > Father-Subordinate             11/15     12/17     7/13      30/45     .02
  Father-Boss > Boss-Younger Brother           8/12      8/14      11/15     27/41     .03

Subordinate Attitude
  Younger Broth.-Sub. > Younger Broth.-Boss    3/15      7/15      7/15      17/45
  Younger Broth.-Sub. > Subordinate-Father     5/11      6/14      8/15      19/40

Family Attitude
  Father-Younger Broth. > Father-Subordinate   11/15     4/13      6/15      21/43
  Father-Younger Broth. > Younger Broth.-Boss  8/15      8/17      8/16      24/48

Work Situation Attitude
  Boss-Subordinate > Boss-Younger Brother      6/14      9/17      10/17     25/48
  Boss-Subordinate > Subordinate-Father        7/14      7/16      6/15      20/45

Note.—The denominators represent the total of the cases not tied, and are thus 
frequently smaller than the total N indicated at the column heads. 

Extension of father-boss comparisons. Be- 
cause the strongest trends obtained were for 
the prediction most strongly presaged in psy- 
chological theory, it seemed appropriate to 
extend the number of cases on this point by 
including those where full symmetry of de- 
scriptive pattern was absent. From the appro- 
priate remaining cases in the three samples, 
comparisons were made for the prediction that 
Father and Boss should correlate higher than 
Father and Subordinate. This prediction was 
confirmed in 43 out of 70 untied instances, 
giving a one-tailed p value of .04. Combining 
this with the 30 confirmations out of 45 in- 
stances reported in Table 1 for the same com- 
parison, gives an over-all score of 73/115 
which has a one-tailed p value of .003. Only 
eight additional instances were available for 
the prediction, Father correlates higher with 
Boss than does Boss with Younger Brother. 
For these, the ratio of confirmations was only 
3/8. Adding these 8 to the 41 in Table 1 gives 
a total score of 30/49, which is significant at 
only the .08 level by a one-tailed test. 
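
The one-tailed values quoted in this section can be checked with an exact binomial sign test. The sketch below assumes a chance probability of one half for each untied comparison; whether the original values came from exact binomial tables or from a normal approximation is not stated, and the function name is hypothetical.

```python
from math import comb


def sign_test_one_tailed(confirmations, untied):
    """P(X >= confirmations) for X ~ Binomial(untied, 1/2)."""
    if not 0 <= confirmations <= untied:
        raise ValueError("confirmations must lie between 0 and untied")
    # Count the outcomes at least as favorable as the one observed.
    tail = sum(comb(untied, k) for k in range(confirmations, untied + 1))
    return tail / 2 ** untied


if __name__ == "__main__":
    # Tallies taken from the text: 43 of 70, 73 of 115, and 30 of 49.
    for hits, n in [(43, 70), (73, 115), (30, 49)]:
        print(f"{hits}/{n}: one-tailed p = {sign_test_one_tailed(hits, n):.3f}")
```

Run on the tallies above, the function should return values close to the reported .04, .003, and .08, with small discrepancies possible if an approximation was used originally.
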


Some other relevant comparisons are generated by 
substituting other persons for the Subordinate or 
Younger Brother, to control for the family, job situa- 
tion, and general halo factors. Thus, for cases where 
there is no Younger Brother, the description of an 
Older Brother should correlate less highly with Boss 
than does Father with Boss, although inasmuch as 
Older Brother has some authority role, the contrast 
should be less sharp. For 38 such untied instances, 
the prediction is confirmed in 23, giving a p value of 
.13. In addition, the Father-Boss correlation should 
be higher than that between Father and Peer (“one 
of your equals with whom you have closest deal- 
ings”). This prediction is confirmed in only 37 out 
of 85 untied cases (including cases from the original 
49). Not only does this finding fail to confirm the 
hypothesis, but it shows a trend in the opposite di- 
rection. It is difficult to explain this failure in the 
face of the other supporting data. 

It seemed possible that the correlation between de- 
scriptions might be a function of the degree of fa- 
vorableness of the descriptions, and that if Peers 
were judged more favorably than Bosses or Subordi- 
nates, they might correlate more highly with Father 
for that reason. To check, the descriptions of the 
basic 49 cases were scored for favorableness. Sur- 
prisingly enough, Younger Brothers were described 
most favorably, Peers least favorably, with Bosses, 
Subordinates, and Fathers between at about the 
same level. Thus degrees of favorableness cannot pro- 
vide an explanation for these inconsistent data. 

For middle class boys the Mother is often more 
frequently the work-supervisor than the Father. If 
generalization is based upon role similarity, one might 
expect Mother-Boss correlations to be higher than 
Mother-Subordinate and Mother-Peer correlations. 
This prediction was confirmed in only 33 out of 65 
instances for Mother-Subordinate, showing no rela- 
tionship. For Mother-Boss greater than Mother-Peer 
(computed only where no Subordinate was de- 
scribed) the score is only 18/48, significant in the 
unpredicted direction at the .12 level by a two- 
tailed test. A test of the expectation that Father- 
Boss would correlate higher than Mother-Boss was 
confirmed in only 59 out of 112 instances, surpris- 
ingly enough. This perhaps is an illustration of a 
relative unimportance of sex as a source of attitude 
generalization, seemingly found in a study of judg- 
ments of photographs (2). 


Summary and Discussion 


The most dependable empirical relationship 
found is for descriptions made of Father and 
Boss to be more similar than those made of 
Father and Subordinate (p < .003). This may 
tentatively be interpreted as supporting the 
notion of a common attitude toward authority 
encompassing both Father and Boss, or evi- 
dence of stimulus generalization. Supporting 
this interpretation is the tendency for Father 
and Boss to correlate more highly than Boss 
and Younger Brother (p < .08), and Boss 
and Older Brother (p < .13). Inexplicably, 
Father and Boss descriptions are no more 
similar than Father and Peer descriptions. In 
the absence of any single integrated explana- 
tion of the findings, these data are inter- 
preted as supporting, with exceptions, the 
initial hypothesis. The differences between 
correlations are, however, very small. 

The study failed to find any evidence for 
generalized attitudes toward subordinates (in- 
cluding younger brothers), family, or work 
colleagues. 


Specific Behavior Changes Following Chlorpromazine 


S. D. Porteus 


Territorial Hospital, Kaneohe, Hawaii 


Though there is no dearth of studies on the 
effects of chlorpromazine with psychotic pa- 
tients, few of them report the data in quanti- 
fied form, or deal with specific alterations of 
behavior. Many investigators give the percent- 
ages of cases slightly, moderately, or markedly 
improved, but it is difficult to determine the 
basis of these judgments. Under such circum- 
stances, it is possible for bias to operate, and 
in the case of the “tranquilizing” drugs, as 
with other new approaches to complex prob- 
lems, such bias unfortunately exists. Medical 
opinion seems to range from enthusiastic approval 
to frank scepticism, but, without research data, 
either attitude is unjustifiable. 

Rather heroic efforts have been made by 
some devisers of rating scales of behavior to 
objectify their observations. Untidiness, for 
example, has been rated in terms of use of 
eating utensils or specific breakdowns in good 
toilet habits; aggressiveness in terms of num- 
ber of verbal or physical assaults, or indi- 
vidual sedations, or seclusive periods neces- 
sary. Ideally, such scales are superior to those 
that make use of generalized subjective char- 
acterizations. But, from the practical standpoint, 
they require more detailed and faithful 
reporting by psychiatric aides, or more super- 
visory checks, than the limited staffs of many 
state hospitals can afford. Moreover, the cata- 
loguing and recording of overt incidents may 
give only the bare outlines of the behavior but 
none of the shading. And as we well know, 
an impressionistic painting, even if somewhat 
blurred, may give a far more adequate picture 
than a line drawing. More concretely, an ag- 
gressive patient may never strike a blow; ex- 
treme depression may be often judged by what 
a person does not do; and the same is true of 
negativism, or withdrawal. 

Moreover, one may start out with a theo- 
retically excellent research design and a fine 
system of checks and balances, but it is an- 
other problem to keep the system in opera- 
tion. Patients drop out of the experimental 
group, supervisory personnel changes take 
place; so that the initial plans for a long- 
term study must undergo alteration. We have 
found it wise to keep the research tools as 
simple in operation as possible. 


Need for Specific Information 


We need specific information as to what 
behavior traits have been changed through 
drug administration, and to what degree, and 
in which directions. It would be idle to as- 
sume that all types of behavior are equally 
improved by the use of a drug such as chlor- 
promazine. It is this neglect to analyze im- 
provement which is responsible for an incor- 
rect labelling of the drugs as “tranquilizing” 
or “ataractic.” If it can be shown that in some 
ways the effect is stimulating rather than de- 
pressant, then the descriptive term equilibrat- 
ing or stabilizing would be more appropriate. 
Rating behavior by separate traits would seem 
to be a necessary complication of the research 
design. 

An important aspect of drug therapy is the 
time factor. It is important for both the phy- 
sician and the research worker to have more 
information as to the course of improvement, 
particularly with regard to possible plateaus 
or even regressions. Such levelling-off periods 
and regressions are of common occurrence in 
the learning process, and it is only reasonable 
to suppose that they also occur in physiologi- 
cal habituation. According to our experience, 
plateaus do occur when the courses of both 
individual and group improvement are plotted. 
They may be best considered as not only re- 
lated to physiological adjustment of patients, 
but to the suggestibility of both observers and 
patients. 

Obviously, treatment and rating of effects 
should be continued long enough to span pe- 
riods of alternating rapid and retarded im- 
provement; otherwise, the treatment may be 
too enthusiastically accepted on the basis of 
initial gains, or regarded as ineffective be- 
cause of patients’ inability to sustain improve- 
ment. The profile of progress, both individual 
and group, does exhibit these ups and downs, 
but whether they follow a regular rhythm can 
only be determined by further research. The 
effect of these fluctuations on suggestibility 
will be discussed later. 


A Ward Behavior Scale 


Our method of rating behavior changes fol- 
lowing routine chlorpromazine treatment has 
been by means of a specially devised graphic 
rating scale. The scale covers eleven traits or 
trait complexes: aggressive or destructive be- 
havior, negativism or lack of cooperation, 
speech disorders, untidiness, restlessness or 
physical over-activity, hallucinations, delu- 
sions, emotionality, mental confusion, degree 
of asocialization and compulsive behavior. 
The ratings extend from mildness or nonap- 
pearance as marked on the left of a six-inch 
line, up to very excessive appearance on the 
extreme right. 

The fact that speech disorders, emotion- 
ality, and socialization are bipolar necessitates 
the use of alternative scales, one for elation 
and one for depression in the case of emo- 
tionality, one for mutism and the other for 
loquacity in speech disorders, one for social 
withdrawal and the other for obtrusiveness 
in socialization. This avoids the difficulty of 
making ordinary behavior the midpoint, and 
then marking the bipolar extremes on either 
side, a system which would prevent the sum- 
mation of rating into total scores. Under our 
scheme, the graphic ratings can be converted 
into numerical equivalents and the scores 
summated. 


Subjective Judgments of Behavior 


Rating scales are often down-graded in psy- 
chological estimation because of their subjec- 
tivity. Their devisers are plagued with prob- 
lems similar to those that beset the path of 
those who arrange intelligence scales. We are 
inclined to dignify the last named by calling 
them objective measures when we mean the 
measures are objectively applied. This confers 
the advantage that two examiners should get 
approximately the same results. But this 
agreement does not mean that the measures 
they apply are therefore the more valid. It is 
well to know that the two examiners may 
agree, but they may agree in being wrong. 

The basic problem lies in the fact that 
neither intelligence nor behavior is unitary; 
both have many aspects or facets. The mental 
tester is wont to assume a generality of 
reference for his tests that is greater than the 
evidence of validity warrants. The test de- 
viser, like the inventor of a behavior rating 
scale, assumes that all the subdivisions or 
items of his scale have equal importance. 
Wechsler, for example, states that “the best 
assumption to make about the separate tests 
of the (Wechsler-Bellevue) scale was that 
they were equally important” (5, p. 116). It 
would have been more correct to use the word 
“easiest” in place of “best.” When we add 
together all the item ratings into a total index 
of behavior we are making a similarly 
unprovable or unproved assumption. 

However, it should be remembered that this 
score is not to be used like an IQ for deter- 
mining an individual’s comparative status, but 
rather for measuring changes in his behavior. 
The rating-scale approach has obvious defects, 
but like the intelligence scale it is the best 
measure at present available. Nor should it 
be forgotten also that the so-called objective 
tests were originally validated against 
subjective judgments of mental ability. 

The descriptive terms used in our scale are 
well within the comprehension of psychiatric 
aides, but their interpretation is assisted by 
use of a guide. We have found it best to rely 
on aides’ judgments rather than on those of 
more highly trained professional personnel. 
Cumming and Cumming (1) have recently 
cited some evidence that aides’ predictions of 
readmissions of discharged patients are more 
reliable than those of others much higher in 
the supervisory chain of command. If in this 
respect they do better than psychiatrists, their 
more intimate social contacts with patients 
should make them better judges of specific 
ward behavior. 


Application of the Scale 


In our study, aides were given practice 
sessions in rating several patients and asked to 
support their judgments as far as possible by 
objective observations. Aides who betrayed a 
tendency to make stereotyped judgments as 
shown by excessive grouping of ratings at 
some particular position on the scale were not 
used in this research. It soon became appar- 
ent that certain individuals of long experi- 
ence showed exceptional balance and that 
nothing was gained by pooling their ratings 
with those of less reliable judges. In the end, 
two aides only were selected as raters though 
they often consulted with others on the ward. 

Another question which arose at the outset 
of the investigation was that of controls. In 
our research design, two closed wards for 
chronic psychotic males were selected for the 
experiment, and it was decided to give 300 mg. 
of chlorpromazine daily to all 80 patients on 
Ward H and placebos to an equal number on 
Ward I. As a matter of practical convenience, 
this plan seemed better than using half the 
patients on each ward as controls. It is so 
much easier to give the drugs or placebos 
from a single bottle three times a day, than 
to have two groups of individuals listed, and 
lined up so as to receive their pulvules from 
the right bottle. Supervision would have been 
more difficult if each patient had his 
individual bottle. 

The populations of the two wards were 
generally comparable. Patients in each ranged 
in chronological age from about 25 to over 80 
years and, in length of residence, up to an 
extreme of 57 years. Both wards contained 
a wide variety of psychiatric types, the 
majority being schizophrenics. Most had been 
subjected to all kinds of psychiatric 
treatments without lasting benefit. They 
represented all ethnic groups resident in Hawaii: 
Caucasian, Japanese, Chinese, Koreans, Fili- 
pinos, Puerto Ricans. In short, the subjects 
could be considered, except from the racial 
angle, typical of the population of the “back 
wards” in any state mental hospital. Only 50 
patients were finally included in each of the 
experimental and control groups, but their 
names were chosen at random. Because so 
little was known of the limitations of the drug, 
Dr. Kimmich, the medical director, decided 
that the most democratic procedure was to 
give all the patients on Ward H the advan- 
tages, if any, of medication. Two ratings with 
a three-week interval were collected on each 
patient before the drug was administered, and 
the scores were averaged to give the premedi- 
cation rating. The drug was then adminis- 
tered over an eighteen-week period. Finally, 
to smooth out the effects of fluctuations in 
improvement, it was decided to average the 
postmedication ratings in pairs, the 3- with 
the 6-week, the 9- with the 12-week, and the 
15- with the 18-week scores. 
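
The smoothing just described can be sketched as follows. The function name and the sample scores in the usage note are hypothetical, and the sketch assumes that each patient has exactly two premedication ratings and six postmedication ratings taken at 3, 6, 9, 12, 15, and 18 weeks.

```python
def smooth_profile(pre_ratings, post_ratings):
    """Return [premedication mean, 3-6 week mean, 9-12 week mean, 15-18 week mean]."""
    if len(pre_ratings) != 2 or len(post_ratings) != 6:
        raise ValueError("expected 2 premedication and 6 postmedication ratings")
    baseline = sum(pre_ratings) / 2
    # Average adjacent postmedication ratings: (3, 6), (9, 12), (15, 18) weeks.
    paired = [
        (post_ratings[i] + post_ratings[i + 1]) / 2
        for i in range(0, len(post_ratings), 2)
    ]
    return [baseline] + paired
```

With invented scores of 30 and 28 before medication and 31, 29, 22, 20, 17, 15 afterward, the four plotted points would be 29, 30, 21, and 16.
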

The double-blind system of keeping both 
aides and patients in ignorance as to which 
ward was receiving the drug and which the 
placebos was adopted, but it was not very 
long before the efficiency of any control sys- 
tem based on placebo administration came 
under serious question. In the first place, 
after the first six weeks, the comparative ab- 
sence of side effects, or of any definite trends 
as regards changed behavior on the placebo 
ward made the true situation very plain, thus 
robbing the plan of its double-blind charac- 
ter. But this is not the most serious objection 
to the placebo procedure. 


Suggestibility and the Placebo Effect 


Adopting placebo administration as part of 
the research design is intended to diminish 
the effects of suggestion. The essence of any 
control plan is that the two groups should be 
equated, but as far as we know, there is no 
way of dividing patients into groups on the 
basis of their suggestibility since we have no 
adequate measure of this trait at hand. 

Whenever an individual is given any form 
of therapy, including medication, with the 
statement that it will do him good, the ex- 
pectation thus set up tends to be fulfilled. 
This fulfillment differs in degree with differ- 
ent people. Moreover, it affects raters’ judg- 
ments also. If the initial impact of the drug is 
strong, their tendency is to give patients bet- 
ter ratings than are justified. If the effect lev- 
els off or subsides, then the disappointed ex- 
pectations of both patients and raters are re- 
flected in ratings lower than they should be. 
Suggestion also plays an important part when 
placebos are administered, so much so that 
their use as a method of control is futile. 
Rosenthal and Frank (3) in a discussion of 
the placebo effect point out that successes in 
almost every form of therapy depend to an 
undetermined extent on this factor. As exam- 
ples, they cite two studies in which placebos 
brought about a very marked diminution of 
distress symptoms. Only 55 per cent of cases 
reported a reduction in the incidence of com- 
mon colds after vaccine treatment as against 
61 per cent of those receiving placebos. Simi- 
lar placebo effects were reported for a variety 
of complaints, including distress from ulcers, 
surgical wounds, and chronic headaches. 
Unfavorable or even toxic side-effects have oc- 
casionally occurred with placebos. Rosenthal 
and Frank in their own study of placebo ef- 
fects found that 69 per cent of their cases 
showed decreased blood pressure, 19 per cent 
an increase, while pulse rate rose in 25 per 
cent after placebo administration. They sum 
up the implications of these results as follows: 


We need to learn more about the nature of the 
placebo effect, the conditions giving rise to it, and 
the attributes of patients most susceptible or resist- 
ant to it, so that we may obtain a better understand- 
ing of the role of the non-specific factors in psycho- 
therapy (3, p. 300). 


In default of this knowledge we have not 
plotted nor reported our placebo results ex- 
cept in one table. We would also point out 
that the method of giving one group of pa- 
tients on a ward chlorpromazine and giving 
another group nothing does not control nega- 
tive suggestion. A very common and impor- 
tant inference would be that patients who re- 
ceive no treatment would very likely get worse. 


Results 


Figure 1 is a composite graph showing the 
course of changes in 9 of 11 traits of our 
scale, with the average of two premedication 
ratings shown at the left. The other three 
points of the graph give the averages of the 
3- to 6-week, the 9- to 12-week, and the 15- 
to 18-week ratings. Two traits are not 
represented, one being asocialization 
(withdrawal-obtrusiveness), which showed no change, 
and the other compulsive behavior, the course of 
which was so irregular as to cast doubt on the 
validity of the ratings. The figure is divided 
into two halves so as to avoid overlapping of 
lines. It should be remembered that the points 
marked are averages of ratings for the whole 
group of 50 patients, and that the higher the 
vertical position the more severe the 
manifestation of the trait in question, while the 
steepness of the slope to the right indicates 
degree of rated improvement. Zero on the graph 
means absence of the trait, and as could be 
expected this is only approached in two 
distinctively psychotic behavior traits, 
hallucinations and delusions. Since these are rated 
inferentially rather than by their more overt 
manifestation, this approach to zero must be 
interpreted cautiously. It means disappearance 
of the symptom rather than nonexistence or 
complete cure. 

Fig. 1. Changes in 50 patients in behavior trait 
ratings, after chlorpromazine. 

It is worth noting that mental confusion 
and speech disorders are relatively the least 
ameliorated by chlorpromazine and are 
probably the closest related to intellectual 
processes. Here is indirect evidence that the drug 
acts more on the basal ganglia or the 
corticothalamic arcs than on the cortex. The analogy 
between lobotomy and chlorpromazine effects, 
particularly decline in vigilance and planning 
as measured by the Porteus Maze Test, has 
recently been discussed in the pages of this 
Journal (2). Self-protective alertness seems 
thus to depend on intact corticothalamic nerve 
pathways. In passing, it may be mentioned 
that continued research at the Kaneohe 
Hospital confirms the discovery that serious losses 
in Maze performance accompany prolonged 
administration of chlorpromazine on a 300 
mg. daily dosage. 

The fact that changes in restlessness and 
aggressive behavior as shown by the graph 
parallel each other so closely indicates an 
interesting relationship. Internal stress may 
express itself in either form of behavior. That 
improvement in these traits is so steady is 
one of the reasons for calling chlorpromazine 
a tranquilizer. But the fact that negativism 
and mutism are also improved shows that the 
drug stimulates as well as depresses. 

[Fig. 2. ... and individual profiles of behavior] 

Emotionality, which as observed, was mainly 
depression, has a rather irregular course. There 
is an initial worsening during the first six 
weeks of medication followed by a marked 
improvement in the second six weeks, with a 
tendency to decrease or level off in the third 
six weeks. Negativism and untidiness follow 
similar courses at about the same psychotic 
level, which also might be expected. 

It may be that vertical height on the graph 
represents not so much severity of psychosis 
as the facility with which the traits may be 
observed by psychiatric aides. There is no 
doubt about the existence of mental confusion 
in a patient, while the overtalkative and the 
mute easily attract attention. On the other 
hand, the most difficult traits to rate are de- 
lusions and emotional states, especially de- 
pression, since their manifestations are less 
overt. The initial deepening of depression, 
like the early sharpening of delusions and 
hallucinations, has been noted in general 
fashion in other studies. 

Figure 2 is also a composite of four graphs. 
The first gives the over-all picture for the 
whole group as regards total scores averaged 
at the same points or time intervals as be- 
fore; the other three are individual profiles 
that seem typical of the various kinds of 
behavioral change. Case 32 exhibited steady 
and marked improvement during the first 12 
weeks, followed by a period of somewhat 
slower progress. Case 31 showed the same 
marked improvement after a slower begin- 
ning and ended in a plateau or slight regres- 
sion. Another type of change is that illus- 
trated by Case 34 who showed an initial 
regression during the first six weeks of medi- 
cation followed by a dramatic improvement 
which resulted in discharge from the hospital. 
As will be seen, the profile of Case 32 ap- 
proaches most nearly the group results. A 
fourth type characterized by initial regres- 
sion, then improvement, and later regression 
was not plotted as being too irregular to in- 
dicate any trend. 


Case Histories 


The histories of the three patients whose 
courses of improvement are charted have been 
summarized as follows: 


The hospital record of Case No. 31 states that at 
13 years of age this patient suffered head injuries 
from a fall, and that thereafter his memory was 
affected. In 1946-48 he was treated at other hospitals 
with 40 electric shock treatments (ESTs) recorded. 
His IQ was reported as 96. He was experiencing 
visual and auditory hallucinations and complained 
of sexual troubles. In November, 1950 he assaulted 
another patient and had ESTs at irregular intervals 
up to 1951. In February, 1956 he went on chlor- 
promazine treatment and during the experimental 18 
weeks progressed from a rating of 29 to 16 points, 
rated as moderate improvement. His Maze scores 
were successively 15½, 12½, 7, 6 years, a dramatic 
decline. He is still on the drug, 300 mg. daily. In the 
third six weeks of medication improvement reached 
a plateau. The most marked changes took place in 
aggressiveness and hallucination. 
Case No. 32 was admitted to the hospital in Feb- 
ruary, 1954 and for two months was put on serpasil. 
His diagnosis was schizophrenic reaction, paranoid 
type. He was reported to be destructive, untidy, ag- 
gressive, with incoherent, rambling speech, marked 
psychomotor activity, no insight nor judgment, and 
assaultive. Later in 1954 he was autistic and disor- 
ganized with delusions of grandeur. In 1956 he was 
given 63 insulin treatments with 57 comas, and 17 
ESTs. On chlorpromazine from February, 1956 he 
showed considerable improvement, being reported as 
cooperative and cheerful. From an initial score of 26 
his rating fell to 11, a change of 15 points which 
puts him in the “marked improvement” class. The 
course of decline in psychotic behavior was steady. 
Case No. 34, a veteran of Japanese ancestry, was 
hospitalized at Lebanon Hospital in 1946 with the 
diagnosis of schizophrenia, affective type. He devel- 
oped delusions of grandeur, had a persecutory com- 
plex, and was destructive and combative. The mother 
was said to be overprotective. He married a girl 
in Wisconsin, had three children, and was divorced. 
Auditory hallucinations with strong delusions and 
ideas of reference appeared at the end of 1954. He 
was given 31 insulin comas and 16 ESTs. At one 
hospital he underwent 12 ESTs and 50 insulin comas. 
The staff psychiatrist noted: “My recent observation 
of the patient is that he is worse despite all the 
treatment.” Thorazine treatment was begun in Oc- 
tober, 1955, and the dose was increased in June, 
1956, to 600 mg. daily. He then changed from a be- 
havior rating of 28 in February to 33 points in six 
weeks, but later insight improved, followed by a 
rapid improvement to a 2 point rating, and ultimate 
discharge. He is working as a restaurant cook. His 
premedication Maze score was 14 years which, when 
examined six months later, rose to 14½ years. In
spite of the rise in score his second test was some- 
what similar to some lobotomy patients. His intelli- 
gence scores at Lebanon were Wechsler-Bellevue Full 
scale IQ 93, Verbal, 97, Performance, 89. For two
months in 1955, he was given serpasil. His improvement
cannot therefore be related solely to chlorpromazine,
though it was finally the effective agent
in recovery. Because of the unusual Maze improvement
without any decline, it will be interesting to
note his further social history. Ward behavior improved
in every respect except asocialization.


Numerical Presentation of Results 


Finally, the aides’ graphic ratings have
been converted into numerical equivalents
and the changes have been grouped in five-point
divisions and presented in Table 1. To
allow for possible rating errors, changes from
0 to 4 points have been regarded as negligible,
while changes from 5 to 9 points of the scale
are considered slight or insignificant. Changes
from 10 to 14 points are classified as moderate,
15 to 19 as marked, 20 and above as
representing very great improvement. The average
change in all 50 cases is approximately
12 points, so that to cut off changes up to 9
points as being insignificant is a very conservative
procedure.
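
The grouping rule just described can be restated as a short sketch. The function names are illustrative and do not come from the study; only the cut-off points (0 to 4, 5 to 9, 10 to 14, 15 to 19, 20 and above) and the treatment of changes above 9 points as significant are taken from the text.

```python
# Upper bounds of each five-point division, as given in the text.
# The names used here are illustrative only.
CATEGORIES = [
    (4, "negligible"),
    (9, "slight"),
    (14, "moderate"),
    (19, "marked"),
]


def classify_change(points: int) -> str:
    """Label the size of a rating change, ignoring its direction."""
    size = abs(points)
    for upper, label in CATEGORIES:
        if size <= upper:
            return label
    return "very marked"  # 20 points and above


def is_significant(points: int) -> bool:
    """Changes above 9 points are the ones the text counts as significant."""
    return abs(points) > 9


if __name__ == "__main__":
    # Case No. 32: rating fell from 26 to 11, a change of 15 points.
    change = 26 - 11
    print(change, classify_change(change), is_significant(change))
```

Applied to Case No. 32, whose rating fell from 26 to 11, the sketch gives a 15-point change in the marked class, which agrees with the classification reported for that case.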

On this basis, 60 per cent of these chronic 


Table 1
Ward Behavior Changes Summarized

(N = 50 in each group)

Rating Point                 Controls        Chlorpromazine Group
Changes                   Worse    Better      Worse    Better

0 to 4 (Negligible)       16.6%    16.6%        4.0%    12.0%
5 to 9 (Slight)            5.5%    44.5%                24.0%
10 to 14 (Moderate)        5.5%     8.3%                26.0%
15 to 19 (Marked)                   3.0%                26.0%
20 plus (Very marked)                                    8.0%
Above 9 (Significant)      5.5%    11.3%                60.0%

male psychotics showed significant improvement
as against 11 per cent of the placebo
group. It would be fair to say that over and
above the placebo effect, 49 per cent, or approximately
half the group, were obviously
improved in ward behavior after their daily 
doses of 100 mg. of chlorpromazine adminis- 
tered over a period of 18 weeks. Considering 
the mental condition and hospital history of 
the group, this must be considered a very re- 
markable result. It would certainly justify the 
opinion of Tarumianz (4, p. 92) that the
“ataractic” drugs have ushered in a new era 
in psychiatry. 

If we abandon this conservative attitude 
and consider any decline in score as improve- 
ment, then 96 per cent of the chlorpromazine 
patients showed some degree of improvement
as against 60 per cent of the placebo group. 
The writer, however, considers reporting in 
such terms misleading. The percentages are 
quoted merely to show perfect accord with 
the statement by Rosenthal and Frank that 
improvement in neurotics with various forms 
of psychotherapy hovers around 60 per cent,
while the same percentage is reported for 
placebo effects in colds and headaches. Since 
the placebo is of so little value as a control 
device, results following its use have not been 
plotted in this study. 


Received February 25, 1957. 
Early Publication. 
Use of the Semantic Differential with Lobotomized
Psychotics¹


Catherine B. Semans²
Athens State Hospital 


This study undertook to use a form of Osgood’s
semantic differential (1, 2) with severely
ill psychotics at Athens State Hospital,
Athens, Ohio, who were candidates for transorbital
lobotomy.

Ten concepts, selected to represent areas 
presumably affected by transorbital lobotomy, 
were rotated against 15 pairs of adjectives, 
12 of which were identical with those used by 
Osgood to cover the good-bad, active-passive, 
strong-weak, and pleasant-unpleasant dimen- 
sions of meaning. Each of the resulting 150 
items was arranged with a 7-point rating 
scale. Each patient was tested individually 

and was asked to rate each concept according 
to his feelings. 


The patient was asked to place a check mark 
to show whether he thought himself as “very 
fast,” “quite fast,” etc., to “very slow.” The 
same procedure was followed for all of the 
150 items. Time required to complete the ratings
was recorded along with behavior notes.
Half of the patients completed two ratings be- 
fore operation and one after, while the other 
half completed one rating before and two 


1 An extended report of this study may be obtained
without charge from Mrs. Catherine B. Semans,
Chief Psychologist, Athens State Hospital,
Athens, Ohio, or for a fee from the American Documentation
Institute. Order Document No. 5184, remitting
$1.75 for microfilm or $2.50 for photocopies.

2 The author wishes to thank Dr. George Klare,
Ohio University, for his supervision, suggestions, and
criticisms.


after. Only 15 of 32 lobotomized patients were 
able to complete the ratings, the others being 
too withdrawn, belligerent, or ignorant to co- 
operate. Seven patients who were candidates 
for the operation, but whose families did not 
give permission for it, formed a small control 
group. 

It was found, first, that significantly greater 
change in Concept ratings occurred when op- 
eration intervened between repetitions than 
when ratings were repeated without interven- 
ing operation. Second, time spent in complet- 
ing the rating scale was very greatly reduced 
following operation; this result suggests the 
need for caution in using time measures alone 
for indication of change following lobotomy. 
Third, atypical performance on the semantic 
differential may be related to pathological re- 
sults of lobotomy which are not immediately 
apparent clinically. Fourth, the semantic dif- 
ferential needs to be simplified for use with 
severe chronic psychotics, but can be success- 
fully used in its present form with psychotics 
in reasonably good contact as a measure of 
attitude changes. Exploratory work is being 
done in developing simplified forms. 


Brief Report. 
Received November 21, 1956. 
A Scale for Measuring Minimal Social Behavior¹


Amerigo Farina, David Arenberg, and Samuel Guskin 


Veterans Administration Hospital, Roanoke, Virginia


Chronic deteriorated patients have been receiving
increased attention since the advent of
the tranquilizing drugs. The problem of evaluating
the effects of these drugs on this type of
patient has become acute. Recently the authors
were called upon to evaluate the therapeutic
effects of a new drug, promazine, on
such patients (4). The existing scales seemed
ill-suited for use at this very low level of
functioning.

A scale was constructed on the assumption
that inappropriate interpersonal behaviors are
an important facet of serious psychiatric illness.
The items selected for this scale are intended
to tap those habit patterns which are
disrupted only in the extremes of pathology.
It is assumed that clinical improvement will
include the reinstitution of such simple conventional
behaviors as shaking hands. A brief
standardized format designed to look like an
interview was selected for the administration
of the scale. It was considered a convenient
sample of how well the patient can deal with
people more generally. The examiner can administer
the scale quickly and without previous
acquaintance with the patient.


1 The authors are indebted to Drs. Warren W.
Webb and Robert A. Wagoner, and to Mr. Paul B.
Fiddleman for advice and help in completing this
study.

2 A guide for the administration and scoring of the
scale and also individual item data have been prepared
and are available upon request. Items 31 and
32 were added subsequent to the promazine study.
A 16 mm. sound training film designed as an aid in
administering and scoring the scale is also
available. For further details write to senior author.


Minimal Social Behavior Scale²


The scale is administered in a room containing
a desk, two chairs, a waste paper
basket, and nothing more. The patient’s score
is the total number of items scored plus.


1. An aide brings the patient to the open door and 
introduces him. Score + if the patient enters and ap- 
proaches the examiner. 

2. The examiner says, “Hello, Mr. .” Score + 
any discriminable response to the greeting, verbal or 
otherwise. 

3. Score + if the response to the greeting is verbal 
and appropriate. 

4. The examiner extends his hand. Score + if the 
patient shakes hands. 

5. The examiner says, “Won’t you have a seat?” 
Score + if the patient sits without further urging. 

6. The examiner says, “How are you today?” Score 
+ any discriminable response to the question, verbal 
or otherwise. 

7. Score + if the response to the question is verbal 
and appropriate. 

8. The examiner drops a pencil by pushing it off 
the edge of the desk, ostensibly by accident. If the 
patient does not pick up the pencil spontaneously, 
the examiner says, “Will you please pick up that 
pencil for me?” Score + if the patient picks up the 
pencil. 

9. Score + if the patient picks up the pencil spon- 
taneously. 

10. The examiner says, “Would you mind moving 
your chair closer?” Score + if the patient moves the 
chair closer to the examiner. 

11. The examiner holds in front of the patient a 
drawing of a three-inch square with diagonals. The 
examiner says, “I have something I want to show 
you.” Score + if the patient looks at the drawing. 

12. The examiner offers a pencil to the patient and 
says, “Here is a pencil.” Score + if the patient ac- 
cepts the pencil without further urging. 

13. The examiner places a pad of blank paper in 
front of the patient and says, “I would like you to 
copy this drawing on this paper.” Score + if the pa- 
tient makes any mark on the paper. 

14. Score + if the patient draws any four-sided 
figure with diagonals and nothing more. 

15. The examiner proffers an opened pack of ciga- 
rettes to the patient and says, “Cigarette?” Score + 
any response which indicates acceptance or refusal. 

16. The examiner says, “How are you getting
along?” Score + any recognizable response to the
question, verbal or otherwise.

17. Score + an appropriate verbal response to the
question.

18. The examiner crumples a sheet of paper, tosses 
it at the waste basket, purposely missing, and says, 
“Damn it, missed again.” Score + if the patient re- 
sponds with a smile or a laugh. 

19. Score + if the patient spontaneously picks up 
the paper and deposits it in the waste basket. 

20. Score + if the patient makes any verbal re- 
sponse, irrespective of content, to all of the follow- 
ing questions: 


(a) What year is it? 

(b) What month is it? 

(c) What day is it? 

(d) What season is it? 

(e) Where are you? 

(f) What is the nearest city? 

(g) Who are some people you know around here? 
(h) Do you hear voices now? 

(i) Is anybody making trouble for you? 


21. The examiner places a magazine, such as Life,
in front of the patient and says, “I’ll be busy a
minute.” The examiner busies himself with paper
work. Score + if the patient turns at least one page
of the magazine.

22. The examiner feigns a headache by rubbing
his head with his hands, assuming a pained expression
and ostensibly attempting to shake off the pain.
Score + any verbal response which includes the content
of “head” or “pain.”

23. The examiner places a cigarette in his lips and 
fumbles for matches. He stands up and pats his 
pockets. A book of matches has previously been
placed within easy reach of the patient. Score + if 
the patient offers or calls attention to the matches 
or offers a light from his own cigarette. 

24. The examiner rises and offers his hand, saying, 
“Thank you very much, Mr. -” Score + if the 
patient rises from his chair. 

25. If necessary, the examiner says, “Go ahead, 
Mr. ,” indicating the door. Score + if the pa- 
tient opens the door and crosses the threshold with- 
out further urging. 

26. Score + unless inappropriate mannerisms
are readily apparent.

27. Score + if the patient at any time looks the
examiner in the eye.

28. Score + unless the patient obviously appears 
to avoid the examiner’s gaze at any time, or stares 
at the examiner fixedly. 

29. Score + unless the patient sits in a bizarre 
position, is in constant motion, or is nearly motion- 
less. Do not confuse with item 26. 

30. Score + unless the patient’s clothes are obvi- 
ously disarranged, unbuttoned, or misbuttoned. 

31. Score + unless the patient is obviously drool- 
ing or nasal mucus is clearly visible or unless food 
deposits are conspicuous on clothes or face.

32. Score + unless the patient rises from his seat 
and moves away from the examiner before the ter- 
mination of the interview without an explanation. 
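
Since the score is simply a count of the items marked plus, the tally can be sketched as follows. The function name and the sample record are hypothetical; only the 32-item length and the rule that the score equals the number of plus items are taken from the scale as listed above.

```python
from typing import Iterable

N_ITEMS = 32  # length of the scale as listed above


def msbs_score(plus_items: Iterable[int], n_items: int = N_ITEMS) -> int:
    """Count the distinct items scored plus; reject numbers not on the scale."""
    plus = set(plus_items)
    out_of_range = sorted(i for i in plus if not 1 <= i <= n_items)
    if out_of_range:
        raise ValueError(f"item numbers not on the scale: {out_of_range}")
    return len(plus)


if __name__ == "__main__":
    # Hypothetical record: the item numbers an examiner marked plus.
    record = [1, 2, 4, 5, 6, 11, 26, 30, 31, 32]
    print(msbs_score(record))  # 10
```

A patient who passes every item would receive 32, and one who passes none would receive 0.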
In the previous study (4), a high degree of
agreement was obtained between two raters 
scoring the same sample of behavior. With 
the examiner interviewing the patient and an 
observer simultaneously rating from behind a 
one-way screen the correlation was .96 (N =
15). A test-retest correlation of .87 (N = 13)
was obtained when an examiner retested the 
same patients after a seven-week interval. Also, 
the scale was able to detect, at a statistically 
significant level, changes which, presumably, 
resulted from the treatment program. The 
thirty items were examined for internal con- 
sistency; none was found to be in disagree- 
ment with the total score. The data from the 
promazine study, then, indicated high interrater
agreement and stable test-retest scores,
and suggested that minimal changes were 
measurable using this scale. 
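
As an illustration of how agreement figures of this kind are obtained, the sketch below computes a product-moment correlation between two sets of scores for the same patients. It assumes that the coefficients quoted above are ordinary Pearson correlations, which the text does not state, and the scores in the example are invented for the purpose.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Product-moment correlation between two paired sets of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences of at least two scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one set of scores is constant")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical totals for seven patients, scored once by the examiner
    # and once by an observer behind the screen (not data from the study).
    examiner = [12, 18, 25, 9, 30, 21, 15]
    observer = [13, 18, 24, 10, 29, 22, 15]
    print(round(pearson_r(examiner, observer), 2))
```

The same function serves for the test-retest case if the second list holds the retest scores of the same patients.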

This paper is a report of a more thorough 
investigation of the scale itself. The primary
aims were to obtain an indication of the va- 
lidity and particularly to determine its reli- 
ability with patients functioning at a low 
level of adjustment. A further aim was to 
compare the reliability obtained with such 
patients with that of another scale in current 
use. The scale selected for this purpose was 
the Hospital Adjustment Scale (HAS). The 
HAS is an inventory listing ninety ward be- 
haviors. It is designed to be filled out by aides 
on the basis of prolonged contact with the pa- 
tients. 


Method and Procedure 


Four groups of male patients were selected
in order of increasing pathology. Groups A, B,
and C were all from one ward housing chronic
patients and were selected by the ward psychologist.
Group D was selected from a ward
where the patients were so regressed that they
required considerable nursing care. Group A
(N = 5) was composed of patients considered
nearly ready to go home. Group B (N
= 15) consisted of more seriously ill patients.
Group C (N = 20) was composed of the least
adequately functioning patients on the ward.
Group D (N = 20) was selected from another
ward where, as noted, the patients were even
less able to care for themselves. Pertinent information
about the groups is summarized in
Table 1.
Table 1


Summary of Pertinent Information About
Patient Groups


                          Grades         Years current
             Age         completed      hospitalization
Group    Mean    SD     Mean    SD      Mean           SD
A        28.3    5.4    13.8    3.1     [illegible]    1.2
B        34.9    5.9     9.5    2.9     5.8            3.6
C        35.0    5.7     7.8    3.8     9.3            5.4
D        37.2    4.3     7.7    2.6     9.6            4.4


The Minimal Social Behavior Scale (MSBS) 
and the HAS were both administered to all 
four groups. In addition both scales were re- 
administered approximately one week later on 
groups B, C, and D. For each patient differ- 
ent interviewers were involved in the read- 
ministration of the MSBS, and different aides 
filled out the second HAS. Furthermore, in 
the second MSBS interview groups B and C
were also rated simultaneously by an observer
from behind a one-way screen.


Results and Discussion 


The correlation between the interviewer 
scores and the scores obtained by the observer 
from behind the one-way screen was .95 (N
= 35). This agrees closely with the correlation
of .96 previously obtained with the 30-item
scale (N = 15). Excellent interrater
agreement is indicated.

Correlations between different interviewers
rating the same patients one week apart are
given in Table 2. Comparable correlations obtained
by different aides independently rating
the patients with the HAS are also listed.
(Group A was rated only once.) The lowest
test-retest correlation for each scale was
for group D. It was expected that the HAS
would be less well suited for patients functioning
at a very low level, and this seems to
be the case. The lower reliability obtained
with the MSBS on these subjects might suggest
that it, too, is a less effective measure at
this level. However, as noted above, in the
previous study (4) a test-retest reliability
with a similarly pathological group yielded a
correlation coefficient of .87 (N = 13). These
differences between the MSBS and the HAS
correlations are not statistically significant
with samples of this size. However, the data
suggest that the MSBS may be more reliable
with patients functioning at this low level. A
question about the suitability of the HAS at
this level is raised by the disparity between
the two means obtained on group D by the
two aides (see Table 3). The difference is significant
at the .01 level. In contrast to this
the difference in mean scores obtained by the
MSBS raters is minimal. At this low level of
adjustment the MSBS seems to be a more reliable
instrument than the HAS.


Table 2
Interexaminer Correlations

Group    MSBS    HAS
B        .80     [illegible]
C        .84     [illegible]
D        .63     [illegible]

* Difference significant at the .05 level.


Table 3
Group D Mean Scores and Differences

              MSBS               HAS
Raters 1      18.50              26.70
Raters 2      18.80              16.05
Difference      .30              10.65
t               .34               3.58
p             Not significant      .01

The means for the initial ratings on the
four groups are given in Table 4. An analysis
of variance among these four groups resulted
in an F ratio of 9.07, which indicates that
these groups differed significantly at the .001
level of confidence. Furthermore, the means
decreased from groups A through D as predicted,
and this sequence would occur by
chance less than five per cent of the time.
This indicates that the scale is capable of
discriminating various degrees of pathological
adjustment on a group basis. In general, the
data indicate that the MSBS may have some
utility when used with patients who are functioning
at a low level of adjustment.


Table 4
Means for Initial MSBS Ratings

Group     N    Mean
A         5    29.8
B        15    24.6
C        20    21.9
D        20    18.6
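
The statement that the predicted sequence of means would occur by chance less than five per cent of the time can be checked by simple counting. The sketch below assumes that, if the groups did not differ, each of the possible orderings of four means would be equally likely; that assumption is ours, since the text does not say how the figure was reached. The means are those of Table 4.

```python
from fractions import Fraction
from itertools import permutations

# Initial MSBS means from Table 4.
means = {"A": 29.8, "B": 24.6, "C": 21.9, "D": 18.6}
predicted = ("A", "B", "C", "D")  # highest to lowest, as hypothesized

# Order in which the groups actually fell, highest mean first.
observed = tuple(sorted(means, key=means.get, reverse=True))

# Under the equal-likelihood assumption, count how many of the possible
# orderings of the four groups match the predicted one.
orderings = list(permutations(means))
p_chance = Fraction(sum(1 for o in orderings if o == predicted), len(orderings))

if __name__ == "__main__":
    print(observed == predicted, p_chance, float(p_chance) < 0.05)
```

With four groups there are 24 orderings, so the predicted one has a chance probability of 1/24, or about .04.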


Summary and Conclusions 


A scale was constructed to measure social 
behavior in chronic, deteriorated patients. It 
was administered to four groups of patients; 
the groups represented four different levels of 
psychopathology. The scale differentiated the 
four groups significantly and the means de- 
creased as hypothesized. Some comparisons 
between the MSBS and the HAS were made 
with the group at the lowest level of adjustment.
With this group the correlations between
repeated measures seemed to favor the
MSBS. The two aides rating this group with
the HAS obtained significantly different mean 
scores. The mean MSBS scores were much 
less disparate. The scale appears to have some 
utility when used with patients at a low level 
of adjustment. 


Received August 20, 1956. 
Observer Reliability of Interaction Patterns
During Interviews¹


Jeanne S. Phillips, Joseph D. Matarazzo, Ruth G. Matarazzo,
and George Saslow²
Massachusetts General Hospital and Harvard Medical School 


This report is the fourth in a series of studies
(9, 12, 13) dealing with the interview as
an instrument for personality research. Much
previous research with the interview has shown
that, while it yields rich psychodynamic and
psychotherapeutic material for some types of
investigation, it is, when entirely uncontrolled,
nevertheless severely limited as a research
tool for the study of behavior. Many investigators
have reported little or no agreement
between two or more interviewers when the
same sample of subjects was interviewed and
each individual was rated on predefined variables
such as anxiety (4), various types of
defenses (11), or specific diagnosis (1, 7),
etc. In view of the methodological shortcomings
of the interview which these reports have
highlighted, our group began to work with
both a partially standardized interview (8),
and certain well-defined interview-interaction
variables, as measured by the Interaction
Chronograph method.

The partially standardized interview involves
certain “rules” for the interviewer to
follow during each of five predefined subperiods
of the interview. Periods 1, 3, and 5
consist of free give-and-take interviewing,
while Periods 2 (silence) and 4 (interruption)
involve stress phases of the interview.
The principal variables recorded during the
standardized interview are listed in Table 1.
It is to be noted that there is no standardization
of content in these interviews.


1 This investigation was supported by a research
grant from the National Institute of Mental
Health, of the National Institutes of Health,
U. S. Public Health Service.

2 Now at University of Oregon Medical School.


The recorded variables include the number 
of interactions of both the subject (S) and 
interviewer; the duration of each action and 
each silence; the adjustment of the partici- 
pants to each other; the frequency of the S’s 
initiative-taking during the silence stress pe- 
riod; the frequency and duration of interrup- 
tions; the frequency of dominances and sub- 
missions, etc. A more complete description of 
the standardized interview, definitions of the 
above-mentioned variables, and a history of 
the development and underlying theory of the 
Interaction Chronograph method will be found 
in previous reports (8, 9, 10, 12, 13).

To date our research has shown the follow- 
ing. First, there are wide individual differ- 
ences in interaction patterns among Ss. Sec- 
ond, the interviewee interaction variables for 
any given subject are highly stable across 
two different interviewers when the latter 
standardize their interviewing behavior (with- 
out standardizing the content of their inter- 
views), and at the same time, the variables 
are modifiable by planned changes in the 
intra-interview behavior of either interviewer 
(12). Third, the striking general stability and 
specific modifiability of interviewee interac- 
tion patterns which were found for our first 
sample of Ss could be cross validated in a 
second sample (9). Fourth, the stability and 
modifiability were equally striking when only 
a single interviewer was used and the test-re- 
test interval was extended to seven days (13), 
in contrast to the first two studies which employed
a test-retest interval of a few minutes.

Before we proceeded with studies designed 
to investigate the “meaning,” or predictive, 
concurrent, and construct validity, of the Interaction
Chronograph patterns, it became apparent
that there was an important methodological
question which the previous studies
had not fully answered; namely, what is the 
influence of the observer on the final inter- 
action pattern recorded for any given sub- 
ject? Although the previous three studies had 
yielded high test-retest reliability (stability) 
coefficients for the interaction variables, the 
fact that the reliability coefficients depend 
upon the input of the observer recording the 
two-person interactions raises the question of 
confounding. That is, for any given subject, 
the final interaction description yielded by the 
Interaction Chronograph method may be an 
accurate portrayal of the S’s “true” interac- 
tion during the interview, or it may also, to 
an unknown degree, reflect decisions, biases, 
or other response sets of the particular ob- 
server. Other currently used interaction sys- 
tems (6) which utilize a human observer dur- 
ing the actual ongoing interview or group 
interaction (in contrast to later analysis of 
typed transcripts), have given little attention 
to the important methodological question of 
the observer’s response sets. 

The present study was designed to investi- 
gate the reliability of the Interaction Chrono- 
graph observer’s recording. Other investigators 
working with the method, notably Chapple 
(2, p. 301) and Goldman-Eisler (5, p. 355), 
have recognized the importance of the ob- 
server’s input to the final interactional de- 
scription of the subject. However, no system- 
atic study of the observer has yet been 
reported. Since only one observer had been 
used in all our observations, the question of 
possible minimizing of interviewee variability 
through the observer’s own constant response 
sets must be examined. 


Procedure 


The availability of two Interaction Chrono- 
graphs in the personnel department of a large 
department store made possible simultaneous 
but independent recordings of the same stand- 
ardized interview by two observers (Os). 


3 The authors wish to express their appreciation to 
the members of the Personnel Department of Carson- 
Pirie-Scott, of Chicago; especially to its head, Miss 
Elizabeth Hatch, for her cooperation and assistance, 
and to Miss Louise Mistlebauer, who served both as 
coordinator and observer in this study. 
One of the Os had had approximately two
years of experience (involving many hundreds 
of interviews) observing the standardized in- 
terview in an employment setting. The second 
O was relatively inexperienced, having re- 
corded only some 10 practice interviews, and 
these in a psychiatric rather than a depart- 
ment store setting. Preparation of the two Os 
for the present study consisted of their re- 
viewing the structure of the partially stand- 
ardized interview and the rules for what con- 
stitutes scorable action and inaction. Experi- 
ence has indicated that a major difficulty 
arises for the O when he tries to decide when 
an interviewee communication unit (action) 
has terminated and inaction has begun. In 
order to surmount this difficulty and thereby 
make the observations more objective, certain 
rules have been established to aid in the decision
of what is scored as inaction. These
rules have been published (3, pp. 5-9; 8, pp.
362-364) and were reviewed by the two Os
before the study began. 

Following the joint review, standardized in- 
terviews were conducted by three experienced 
interviewers over a one-week period with 
seventeen randomly selected Ss. The seven- 
teen Ss, who were being routinely interviewed 
and evaluated by the personnel department,
consisted of applicants for jobs and employees 
being considered for promotion. The inter- 
views were simultaneously but independently 
recorded by the two Os who sat in a small, 
totally dark room and watched the interview 
through a one-way window. An intercom- 
munication system was used to transmit the 
voices in all but the first three interviews, 
when mechanical failure forced the Os to use 
visual cues alone. Because of the darkness of
the Os’ room, the use of earphones, and the
Os’ distance from the recording machines, no
visual or auditory cues were available to
either O to indicate the other’s recording, assuring
independence of the two sets of observations.

Results 


Table 1 contains the means and sigmas; ‘ 
well as the Spearman rho and Pearson 7 7° 
liability coefficients for the nine major 10 a 
action Chronograph variables. Although mé f 
values (i.e., raw scores divided by mumbe as 
units for each S) are usually used as sco 


as 
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a alvina Ss, it was felt that these might Table 2 
scure differences between the results of the 
oera Observer Reliability for Individual Peri 

oo Therefore, individual raw scores were iri ge fap eg 
Sed in the computations. Mean values for the 

l . as ae ; 
17 Ss Ge, the mens acoc heni? indi- S’s Units S’s Action S’s Silence 
vidual raw scores) are presented for the total Period rho r rho r rho r 


Interview in Table 1. The reason for present- 
Ing values for both rho and z in Table 1 is 
made clear by the results with the variable 
S’s Adjustment. For this variable, due to the 
influence of only 3 cases, r is reduced to an 
insignificant value of .398, while rho, a meas- 


1.000 .998 958 .967 910.952 
476 889 . .988 .997 855.911 
972 .973 946 .970 .781 .840 
-710 .787 .994 .999 -763 .940 
.984 984 .892 .878 901 .956 


Ue U Ne 


p = .05, r = .482, rho = .49. 
$ = .01, r = .606, rho = .64. 


Table 1 


Observer Reliability for Total Interview - ure which is less sensitive to extreme devia- 


tions, yields a highly significant value of .710. 
Since we have been dealing with interaction 


Mean 
Re raw “a b measures the characteristics of which we are 
ariable score > SD T only now beginning to determine, and with 


SSS eS S—C—:SC*=“#EE * fi . 
S’s Units relatively small Ws, we have felt it wise to 
Obs. X 50.70 18.92 .962 .985 .01 compute both 7 and rho in our various analy- 
ses. Taken together, the results in Table 1 are 


Obs. Y 50.70 18.79 
nee striking evidence that, even with an inex- 
pi es 4153 847 998 998 .01 perienced observer, recordings of Interaction 
Obs Yy 4169 851 Chronograph patterns during standardized in- 
terviews are very reliable. For r, eight of 
S’s Silence the nine variables have reliability coefficients 
is nes ne 10 980 OL above .94, while six are above .98. The results 

j are equally striking for rho. 

S’s Tempo Since the standardized interview consists of 
1.000 .999 .01 fiye subperiods, it is of interest to ask how 


Obs. X 4584 742 
4585 736 reliable are the observations for these pe- 


Obs. Y 
S "5 riods in contrast to the interview as a whole. 
mectivity’ 998 .996 .01 Table 2 presents the subperiod observer reli- 


3722 990 
aoe 3753 1000 ability for a sample of three of the interac- 
tion variables: the number of S’s Units (ac- 
tions), the duration of S’s Actions, and the 


S's Adjust. * 
10 .398 01 x 

Obs. X F EA vee , duration of S’s Silences. The values of r 

Obs. Y a i within subperiods for these three variables 

Int’s Adjust. range from .787 to .999, despite the fact that 

ct 76.03 859 944 Ol each is based on only a small time sample of 


Obs. X —85.41 
—73.94 65.10 the total interview. The one relatively low 


Obs. Y 
, value of rho, .476, for S’s Units in Period 2, 
veoma 41.59 1710 928 948 01 iş a statistical artifact due to a number of 
Obs. V 32.53 144 tied ranks, as can be seer by the high value 
(.889) of the Pearson r for this same vari- 
Int.’s Units 704 17.26 993 999 01 able. Of the 45 period-variable combinations 
On 4735 17.55 (9 variables times 5 subperiods, of which 15 
p i are shown in Table 2), 10 of the Pearson r 


significant at the 01 level, while r observer-reliability values were 99; 20 were 


* For this variable rho is sig 
i i deviant cases. pain é : 
i pe scent a ce “Int. is Interviewer, Obs.” is Ob- 95 and above; 28 were 00 and bangs; ae 


\ Server, 
Table 3

Observer Reliability for Initiative, Dominance,
and Quickness

                       Mean raw
Variable               score       SD      rho     r      p

S’s Initiative (2)
  Obs. X                 7.47      2.81    .765    .877   .01
  Obs. Y                 ?.53      3.30
S’s Dominance (4)
  Obs. X                 3.18      6.66    .532    .588   .05
  Obs. Y                 6.59      7.37
S’s Quickness (2)
  Obs. X              -127.59     60.04    .941    .954   .01
  Obs. Y              -128.82     55.34


40 (89%) were .70 and above.⁴ Similar values
were found for rho. Only one variable, S’s 
Adjustment, Periods 2 and 3, was found to be 
unreliable. These two instances of subperiod 
unreliability appear to be due in part to the 
very restricted number of observations rele- 
vant to S’s Adjustment which occur in Periods 
2 and 3. Since S can fail to adjust (interrupt 
or fail to respond) to the interviewer only 
when the interviewer himself has acted, the 
occurrence of approximately 3 and 5 inter- 
viewer’s Units in Periods 2 and 3, respec- 
tively, meant that S’s Adjustment in these 
periods depended upon very few—3 and 5— 
observations. Therefore, relatively small dif- 
ferences in observing one unit of adjustive 
behavior out of the three instances led to un- 
reliability between Os for these 2 subperiods 
for this variable. Since these were the only 
two instances of unreliability, it can be con- 
cluded that observer reliability is high for in- 
dividual subperiods as well as for the inter- 
view as a whole. 
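The remark above that a low rho can sit beside a high Pearson r when many ranks are tied can be illustrated with a small sketch. The counts below are hypothetical and are not the study's data; the function names are illustrative only. The sketch also assumes that rho is computed as a Pearson correlation on average (mid) ranks, which may differ from the hand formula the authors used.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Rank-order correlation: Pearson r computed on average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical unit counts for six interviewees from two observers.
# Most counts are tied at 3 or 4; one interviewee has a clearly higher count.
obs_x = [3, 3, 4, 4, 3, 9]
obs_y = [3, 4, 3, 4, 4, 9]

r = pearson_r(obs_x, obs_y)
rho = spearman_rho(obs_x, obs_y)
print(f"r = {r:.3f}, rho = {rho:.3f}")
```

With so few distinct values, a disagreement of a single unit moves an interviewee across a whole block of tied ranks, which lowers rho far more than it lowers r.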

Table 3 presents the reliability results for 
the three variables which are scored only in 
the stress periods (2 and 4) of the standardized
interview. These variables are S’s Initiative,
S’s Dominance, and S’s Quickness.⁵

⁴ To save printing costs, these 45 period-variable
correlations have been deposited with the American
Documentation Institute. Order Document No. 5183,
remitting $1.25 for microfilm or $1.25 for photocopies.

These variables, like those in Table 1, are usually
divided by the number of S’s Units and
hence express average frequency or duration
per unit. The individual raw scores were utilized
in the present study, however, as stated
earlier. It is clear from the values shown in
Table 3 that, despite the fact that these three 
variables are derived from only a small sam- 
ple of the total interview, the Os nevertheless 
attained considerable reliability (.01 level of 
confidence for S’s Initiative and S’s Quickness,
and .05 level for S’s Dominance). The
finding of significant but lower observer reli- 
ability for the Period 4 S’s Dominance vari- 
able would seem to support our earlier hy- 
pothesis (12, p. 427) that the “fast pace” of 
Period 4, with both the S and interviewer 
talking at the same time, presents the O with 
the most difficult recording situation. Refer- 
ence to Table 2 of the present study sheds 
further light on this possibility. In this table
the Pearson r correlations for S’s Action and
S’s Silence in Period 4 are extremely high
(.999 and .940, respectively), while the r for
S’s Units is somewhat lower (.787) but still
at the .01 level. Such results suggest that the
Os differ by only several hundredths of a
minute in recording how long an S speaks
and is silent in Period 4, but that the extremely
small differences in duration occasionally
result in observer disagreements as to
whether S stopped acting before or after the
interviewer stopped acting and thus some- 
what reduce observer reliability for S’s Domi- 
nance. Likewise, the very small discrepancies 
in observer input for the duration measures 
may occasionally result in differences as to
whether S stopped very briefly and then began
a new unit or was continuously acting in
one unit.

⁵ Initiative: the number of times, out of the available
number of opportunities (usually 12) in Period 2,
in which S acted again following S’s own last action.
Dominance: the number of times in Period 4
that S “talked down” the interviewer minus the
number of times the interviewer talked down S.
Quickness: the length of time in Period 2 that S
waited before taking the initiative following his own
last action. Quickness is routinely scored in department
store applications (employee selection) of the
Interaction Chronograph and is thus included here
despite the fact that we have not heretofore used it
in our own research on interaction patterns in
interviews.

As is shown in Table 2, however,
such minor differences in observer-input for
duration measures apparently have little effect
on the reliability of some of the variables (S’s
Units, S’s Silence, S’s Action), even in Period
4, although they may result in more serious
differences in the scores obtained for S’s Dominance
(Table 3). However, the fact that S’s
Dominance could yield an observer reliability
coefficient at the .05 level of confidence despite
the fast pace of Period 4 implies that,
with further refinements in definition, and
possibly more observer practice in Period 4
interactions, this variable may also be as reliably
observed and recorded as the others.
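The argument above, that disagreements of a few hundredths of a minute barely affect duration measures yet can change the Dominance score, can be sketched as follows. This is only an illustration under stated assumptions: the `DoubleTalk` record, its field names, and the timings are hypothetical, and the scoring rule simply follows the definition of Dominance given in the footnote (times S talked down the interviewer minus times the interviewer talked down S).

```python
from dataclasses import dataclass


@dataclass
class DoubleTalk:
    """One hypothetical episode in which S and the interviewer act at once."""
    s_stop: float    # minute at which S stopped acting
    int_stop: float  # minute at which the interviewer stopped acting


def dominance(episodes):
    """Times S outlasted the interviewer minus times the interviewer outlasted S."""
    s_talked_down = sum(1 for e in episodes if e.s_stop > e.int_stop)
    int_talked_down = sum(1 for e in episodes if e.int_stop > e.s_stop)
    return s_talked_down - int_talked_down


# Two observers record the same three episodes. They differ only on the
# second episode, and only by .02 of a minute in when S stopped.
observer_x = [DoubleTalk(1.00, 0.98), DoubleTalk(2.50, 2.51), DoubleTalk(3.20, 3.10)]
observer_y = [DoubleTalk(1.00, 0.98), DoubleTalk(2.52, 2.51), DoubleTalk(3.20, 3.10)]

print(dominance(observer_x))  # S wins 2, interviewer wins 1
print(dominance(observer_y))  # S wins all 3
```

Because Dominance counts the direction of each difference rather than its size, a tiny timing discrepancy flips a whole unit of the score, which is consistent with the lower reliability reported for this variable.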


Discussion 


Considering the study as a whole, it is clear 
from the results presented that, with the pos- 
sible exception of the S’s Dominance variable, 
the observation and recording of interaction 
patterns during the partially standardized interview
is a highly reliable undertaking. The
unusually high coefficients of correlation for 
the total interview (Table 1) imply that the 
observer’s task is largely a mechanical one 
once he has read, understood, and practiced 
the published rules as to what constitutes an 
action and an inaction (3, pp. 5-9; 8, pp. 
362-364). Observer response-sets or biases 
appear to have little effect upon the inter- 
viewee interaction record finally obtained. 

The high observer-reliability results, by themselves, do not fully answer a
second question which motivated the present study: Was the high test-retest
stability found earlier (9, 12, 13) for interviewee interaction patterns
under conditions of the standardized interview a function of “true”
invariance and predictable modifiability in interviewee characteristics, or
were these earlier findings influenced by the memory of the observer, his
input in retest being influenced by his memory of input in original test? In
order to answer this question better, a design using one observer for the
test and a different observer for the retest would have been superior to the
one we used (12). However, we were interested, in our initial studies, in
varying interviewers and could not, within the limits of the design chosen,
also vary observers. The latter, we felt, had to be controlled while we
studied our main dependent variable, a single interviewee’s interaction
patterns across different interviewers.

Despite the fact that a somewhat better 
design exists for testing the possible influence 
of the observer’s memory on our previous 
findings, the results of the present study carry 
strong implications which help us to assess 
this potential influence. There are several rea- 
sons why the present results tend to rule 
out any significant influence of the observer’s 
memory on the high reliability coefficients of 
the original studies. First, and most impor- 
tant, the results of the present observer-reli- 
ability study suggest that, neither in the case 
of an experienced observer nor an inexperi- 
enced one did observer response sets influ- 
ence the final record of any given S’s inter- 
action. In other words, as mentioned earlier, 
the task of the observer is a more or less 
mechanical one, so that extraneous response 
sets seem to have minimal influence on the 
final record. Second, the interaction variables 
which the observer records during a complex, 
live, two-person interview are swift moving, 
so that, even for a single S, it is hard to see 
how, during initial test, the observer could 
ascertain or remember such facts as, for ex- 
ample, that S had 87 units: each one on the 
average .57 of a minute in duration; his si- 
lences averaged .09 of a minute; his malad- 
justments to the interviewer took the form 
primarily of interrupting him, doing this 
about 24 times in Period 1; each for an average
duration of .08 of a minute; he failed
to synchronize 43 times in all; he took the 
initiative 6 times out of 12 in Period 2 and 
submitted 3 times more often than he domi- 
nated in Period 4; he talked on the average 
.46 of a minute per unit in Period 3, but .72
of a minute per unit in Period 5; etc. If an 
observer were capable of absorbing all this in- 
formation merely from observing an S in the 
initial interview (the observer never saw the 
machine record), which it is our belief he is 
not, he would still have to translate such data 
in all their complexity accurately from recall 
while the retest interview was going on, in 
order to ascribe the earlier findings of sta- 
bility to his memory rather than ascribing 
them to patient invariance. Data on human 
learning would suggest that events as complicated
as those we have been recording are not
learned and recalled (even five minutes later) 
by a human observer with the high degree of 
reliability found in our first two test-retest 
studies. Third, even if it were possible for an 
observer to learn and remember the complex 
events investigated in the first two studies 
until retest a few minutes later, it is hardly 
likely he could have remembered such a com- 
plicated set of events over a seven-day pe- 
riod, as in our third study (13). This is espe- 
cially the case since he might have observed 
as many as 6 or 8 other Ss’ interviews in the 
seven days between test and retest. The in- 
fluence of both proactive and retroactive in- 
hibition would certainly tend to cancel out 
whatever memory “traces” the observer might 
have had. 

In view of these considerations and the re- 
sults of the present study, it seems far more 
probable to us that the role of the observer 
is a mechanical one, and that the stability of 
interviewee interaction patterns which we have 
previously reported is a “true” characteristic 
of interviewee behavior, and not an artifact 
of the method of observation. 

With the completion of the present study
it is our belief that all major aspects of the 
reliability of the Interaction Chronograph 
variables have been investigated. Thus to date 
we have studied the reliability of the inter- 
viewer who serves as the independent variable 
by following the rules of the partially stand- 
ardized interview (9, 12); the reliability of 
the interviewee interaction patterns, the de- 
pendent variables (9, 12, 13); the reliability 
of the scorer who scores the final Interaction 
Chronograph record (12, p. 429); and, finally,
the reliability of the observer’s input (the 
present study). Having answered the ques- 
tions relevant to reliability, we have since 
turned to the question of the validity of the 
interviewee interaction patterns. 


Summary 


The design of our previous investigations 
made it impossible for us to study the pos- 
sible role of the observer in accounting for the 
high interviewee stability coefficients which 
we obtained in a partially standardized inter- 
view. Results of the present study, utilizing 
one highly experienced observer and another
observer with only minimal experience, inde- 
pendently and simultaneously observing the 
same 17 interviews, indicate that the role of 
the observer is a highly reliable, almost me- 
chanical one. For the interview as a whole, 
8 of the 9 major interaction variables yielded 
observer reliability coefficients (Pearson rs)
above .94. Equally high values were found for 
many of the five subperiods of the interview, 
while only one variable, S’s Adjustment, was 
found to be unreliable in any subperiod. The 
variable S’s Dominance in the fourth (interruption)
period showed only moderate reliability (.05
level of confidence), in contrast to the high
reliability of the other major variables. With
the present demonstration of observer reli- 
ability completed, and the earlier demonstra- 
tions of interviewer, interviewee, and scorer 
reliabilities, we plan now to devote further 
research efforts to the question of the validity
of the interview interaction variables. 
Manifest Anxiety and Test Taking Distortion
of the Blind¹


Sidney I. Dean 


University of Portland 


An earlier study of blind subjects (1) in- 
dicated that the MMPI mean score psycho- 
graphs for both sexes yielded patterns within 
normal deviations on all clinical scales. This 
study utilizes two measures derived from the 
MMPI. The blind are said to be an anxious 
group; the Taylor Manifest Anxiety Scale 
(MAS) was used to investigate this. The 
Gough raw F minus raw K (F — K) was used 
to measure the defensive nature of the blind
test-taking attitude. 

The subjects were 34 male and 20 female 
blind. They varied from totally blind to 
“good” travel vision, from blind at birth to 
recent acquisition, from good to poor adjust- 
ment by judges’ rating. The MMPI short form 

(first 366 items plus full K and Si scales) 
was administered verbally. Within these items 
there are 38 of the 50 originally used by Tay- 
lor, and manifest anxiety should be indicated 
if present. The F — K would indicate the va- 
riety of defense if it proves to be beyond “nor- 
mal” expectations. 

The MAS produced a multimodal distribu- 
tion with a mean of 11.09 and a median of 
10.66 (estimated with the full 50 items), suggesting
less than “normal” manifest anxiety.
The F — K yielded a mean of — 13.19 and a 
median of — 13.50, indicating an attempt to 
“look good” and deny that blindness is in- 
capacitating. The Composite score (Comp: 
number and degree of deviations from normal 


¹ An extended report of this study may be obtained
without charge from Sidney I. Dean, Mental
Health Clinic, Florence, South Carolina, or for a fee 
from the American Documentation Institute. Order 
Document No. 5179, remitting $1.25 for microfilm 
or $1.25 for photocopies. 


range) (1) gave a mean of 13.43 and a 
median of 13.70. The Comp and MAS are 
correlated .40, beyond the 1% level, and 
may both be expressions of “maladjustment.” 
F—K and MAS are also correlated beyond 
the 1% level at .57; as anxiety increases so
does the attempt to look good. F—K and 
Comp are not significantly related at .18; as 
adjustment worsens, defensiveness does not 
systematically change. Analyses of variance 
were executed and none produced significant 
Fs. The t tests between sexes indicated a common
population mean, but F tests for variances
all differed at the 1% level.

Taylor has acquired different medians and 
means with different samples and variations 
of her scale. Results from other investigators 
are more helpful in evaluating the findings in 
this study. A deception index has been sug- 
gested for the F — K to include — 11 or below 
as “faking good,” but more definitive work 
has been done on the positive end of the scale.
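As a minimal sketch of the index described above, the F − K score is simply the raw F score minus the raw K score, and the suggested deception index treats a value of −11 or below as “faking good.” The function names and the individual raw scores below are hypothetical; only the cut-off of −11 and the group mean of −13.19 come from the text.

```python
# Cut-off taken from the text: an F - K of -11 or below is read as "faking good".
FAKING_GOOD_CUTOFF = -11


def f_minus_k(raw_f, raw_k):
    """Gough index: raw F score minus raw K score."""
    return raw_f - raw_k


def suggests_faking_good(index, cutoff=FAKING_GOOD_CUTOFF):
    """True when the index falls at or below the suggested cut-off."""
    return index <= cutoff


# Hypothetical subject with raw F = 3 and raw K = 17.
subject_index = f_minus_k(3, 17)
print(subject_index, suggests_faking_good(subject_index))

# The reported group mean of -13.19 also falls below the cut-off.
print(suggests_faking_good(-13.19))
```

Applied to the reported mean and median (−13.19 and −13.50), the cut-off places the typical subject in this sample on the “faking good” side of the index.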

The blind appear to differ from both nor- 
mal and clinical groups on the MAS. They appear
to defend themselves by distortion in the
direction of “looking good.” Anxiety seems related
to worsening adjustment and greater defensiveness.
Female blind subjects are more
variable in their responses. Acuity, occurrence,
and adjustment were not differentiated
by analyses of variance.

Brief Report. 
Received November 26, 1956. 
Selection of Neuropsychiatric Patients for
Group Psychotherapy¹

Leonard P. Ullmann
VA Hospital, Palo Alto, California

Commenting on research in group therapy during 1955, Harris remarked:
“Selection of patients for groups continues to be a ‘wide open’ area. For
each article insisting that a certain type of patient is unsuitable for
treatment, there seems to be a corresponding publication reporting a positive
group experience with just such patients” (6, p. 139). The literature of
patient selection for group therapy draws heavily on therapists’ opinions and
experience (1, 5) and frequently is expressed in diagnostic and sociological
signs (4, 15). With the possible exception of Kotkov and Meadow’s (8) finding
that a patient is more likely to remain in treatment if FC is greater than CF
on the Rorschach, the usefulness of psychological tests has been asserted
rather than demonstrated (10, 12, 13).

This study presents the relationships of two thematic tests to criteria of
(a) patient behavior in therapy groups within two weeks of testing, and
(b) hospital status six months after testing. In addition to meeting a
practical need where the number of group therapists is limited, significant
test-criteria relationships may generate hypotheses about the usefulness of
group treatment.

¹ From the Veterans Administration Hospital, Palo Alto, California. This
report is based in part on a dissertation submitted to the Department of
Psychology and the Committee on Graduate Study of Stanford University, May
1955. The writer wishes to express his appreciation to Drs. C. [...]
McNemar, and Sanford Dean for their [...] aid in this investigation.
Material was collected at the VA Hospital, Palo Alto; cooperation of the
hospital staff and Drs. Wesley Becker, Glen Brackbill, Robert McFarland and
Donald Shannon, who [...] material, is gratefully acknowledged. Part of
[...] was read at the 1955 convention of the American Psychological
Association.


Subjects, Criteria, and Procedures

Subjects

As part of an ongoing project,² all patients receiving group treatment at a
Veterans Administration neuropsychiatric hospital were routinely rated on
the Palo Alto Group Therapy Scale by their group therapists. The present
study examined as many of the patients as possible on whom group ratings
were made during an eight-week period. The sample consisted of 72 patients,
60 of whom were testable, who were administered the two tests described
below within two weeks before or after being rated on the group therapy
scale. The patients used represent the entire range of adjustment as rated
on the Palo Alto Group Therapy Scale and came from twelve groups with
ratings made by ten therapists.

Criteria 

Finney (3) has described the construction 
and validation of a method for rating pa- 
tients’ behavior in discussion-type therapy 
groups. The Palo Alto Group Therapy Scale
measures adequacy of interpersonal relation- 
ships as manifested in the group therapy situ- 
ation. The scale is a checklist of 88 items. 
The patient’s score is the number of items 
which indicate good interpersonal relation- 
ships. Finney (3) reported that for 18 groups 
in a neuropsychiatric hospital a median rank- 
order correlation of .84 was obtained between 
scores on the scale and global rankings by 
group leaders. In the same study, a rank or- 
der correlation of .80 was obtained between 
the average ratings by ten ward personnel 


2 Thanks are due Dr. Ben Finney who made data 
from his researches available for this study. 
of adequacy of interpersonal relationships
throughout the hospital and the patients’
scores on the group therapy scale. The Palo
Alto Group Therapy Scale offers a way of 
quantifying patient behavior in therapy groups 
in terms which are meaningful within the 
framework of group interaction and within 
the larger context of the hospital. 

In the present study, the criterion of hos- 
pital status divides the sample into (a) those 
patients who, six months after the completion 
of testing, were either discharged or on a trial 
visit which had lasted at least 90 days, and 
(b) those patients who did not meet this cri- 
terion. Of the 60 testable patients, 26 were in 
the improved hospital status group and 34 
were not. Two of the twelve untestable pa- 

tients were in the improved hospital status 
group. While the criterion of improved hos- 
pital status may overlook important gains 
made by patients who remain in the hospital, 
it is used in this study as a rough, practical 
measure of improvement.


Test Devices 


Two tests, each administered in 15 to 20 minutes, were used. The first test
consists of six TAT cards. Cards 4, 6BM, 7BM, 13MF, 15, and 17BM were
selected after a review of work by Eron (2) and Weisskopf (16) suggested
that these pictures are ones beyond a minimum level of transcendence on
which subjects tend to produce a greater than average number and variety of
thema. The TAT protocols were scored by a clinician familiar with the group
therapy scale whose task was to predict scores on the group therapy scale
from the total TAT protocol. To find rater reliability, two other clinicians
ranked the first third of the 60 TAT protocols as to predicted group therapy
scores. The rank-order correlations between pairs of these three raters were
.76, .71, and .68 for the 20 cases.

The second test was a new set of stimuli designed for this study called the
Social Perceptions Test (14). The test consists of twelve line drawings of
people who are faced with conflicting socially approved reasons for action
which are made explicit by their gestures and speech inserted in cartoon
“balloons.” The conflicts center around a reason for action which is
relatively more self-satisfying as opposed to a reason which is more
self-restrictive. The subject is instructed that he is
to take a test of his knowledge of human nature,
and that after looking at each of the pictures
he will be asked some questions. These
questions are (a) why should and (b) why 
should not the picture hero do some action 
suggested by the explicit reasons given in the 
situation, and (c) how would the person feel
if he did and (d) did not do the action. The 
answers to the first two questions, which deal 
with reasons for an action, are scored on a
scale of recognition of social motivations de- 
scribed in detail in a test manual (14). For 
the first 20 testable cases of the sample, the 
reliability of two raters for the scoring of 
recognition was a product-moment correlation 
of .99. Responses to all four questions are
used to derive a score of the subject’s ability 
to feel motivations in complex social situa- 
tions. This scale is essentially one of the de- 
gree to which the social motivations have been 
internalized and are felt to be important to 
the welfare of the picture hero. The reliability of two raters scoring the
first 20 testable cases of the sample on the scale of ability to feel
motivations yielded a product-moment correlation of .92. For both the
recognition and feeling scores, the scores of the odd numbered pictures
correlated .87 with the scores of the even numbered pictures, for the 60
cases in this sample. Correlations between the TAT ratings and the Social
Perceptions Test scores in this study were .60 for recognition and .63 for
feeling of social motivations.


Results

Estimates of the group therapy scale scores from TAT protocols by the expert
yielded a product-moment correlation of .58 for the 60 testable cases.
Ability to recognize motivations as measured by the Social Perceptions Test
correlated .46 with the criterion of group therapy scale ratings for the
sample of 60 cases. The score of ability to feel motivations as measured by
the Social Perceptions Test correlated .59 with the criterion for the 60
testable patients. The correlations of the TAT estimates and the feeling
score of the Social Perceptions Test with the group therapy scale ratings
are statistically significant beyond the .0001 level. Of the twelve
untestable cases, three were above the median of the group of testable
patients on the group therapy scale. Untestability may be taken as a
tentative indication of poor prognosis of adequate interpersonal
relationships in a group therapy setting.

In this research, hospital status six months after the completion of testing
was used as a rough measure of improvement, and therefore as a possible
criterion for the selection of group members. Biserial correlations were
computed for the 60 testable patients using scores of the projective
techniques as the graduated variable. The ratings of the TAT and hospital
status six months later yielded a biserial correlation of .41, which for 60
cases is significant beyond the .01 level of statistical significance. The
biserial correlation between recognition and hospital status change for the
60 cases was .52, significant at the .001 level. The biserial correlation
between hospital status and the ability to feel social motivations as
measured by the Social Perceptions Test was .58 and is significant at the
.0001 level of statistical significance. Since ten of the twelve untestable
patients were in the hospital at the time of follow-up, untestability may be
tentatively considered a poor [...] improved hospital status. Adding these
cases would [...] of patients correctly identified by both projective
techniques. Biserial correlations between ratings on the group therapy scale
and hospital status yielded correlations of .31 for the 60 testable
patients, and .36 for the total sample of 72 patients. These relationships
are between the .05 and .01 levels of statistical significance. The results
indicate that the tests used in this study, although significantly related
to the criterion of scores on the group therapy scale, add information
beyond that [...] measure when change of hospital status after six months is
the criterion. [...] in which the test devices [...] group [...] to
indications of good interpersonal relationships and therefore, [...] future
positive change in hospital status, it was found that the tests used were
[...] “hitting” correctly the change [...] than the group [...] ratings.
After Yates’s correction had been applied, the greater number of hits of the
discharge criterion by the TAT as compared to the group therapy scale scores
yielded a chi square of 3.35, significant between the .10 and .05 level
(9, pp. 228-231).
By the same method, when the group therapy 
scale ratings disagreed with the scores of 
ability to recognize motivations as measured 
by the Social Perceptions Test, the differ- 
ences favored the test score as a predictor of 
changed hospital status by yielding a chi 
square of 4.26 significant beyond the .05 
level. Ability to feel social motivations as 
measured by the Social Perceptions Test cor- 
rectly identified 17 of the 19 cases where there 
was disagreement with the group therapy 
scale ratings as to the criterion of changed 
hospital status. The difference between group 
therapy scale ratings and ability to feel social 
motivations, after Yates’s correction had been 
applied, yielded a chi square of 10.32, signifi- 
cant beyond the .005 level. 
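The chi square of 10.32 reported above can be reproduced from the 17-of-19 split if one assumes that the comparison was a one-degree-of-freedom chi square on the disagreement cases only, with Yates’s correction, which for two counts reduces to (|b − c| − 1)² / (b + c). This is an assumption about the computation, offered as a check on the arithmetic; the text does not spell out the formula, and the function name below is illustrative.

```python
def corrected_chi_square(hits_first, hits_second):
    """Chi square (1 df) with Yates's correction for two disagreement counts.

    hits_first:  disagreement cases in which the first predictor was correct
    hits_second: disagreement cases in which the second predictor was correct
    Equal expected frequencies are assumed under the null hypothesis.
    """
    total = hits_first + hits_second
    if total == 0:
        raise ValueError("at least one disagreement case is required")
    return (abs(hits_first - hits_second) - 1) ** 2 / total


# Ability to feel social motivations was correct in 17 of the 19 cases in
# which it disagreed with the group therapy scale ratings, leaving 2 cases
# in which the ratings were correct.
chi_square = corrected_chi_square(17, 2)
print(round(chi_square, 2))  # 10.32
```

The agreement with the printed value suggests, without proving, that the other two chi squares (3.35 and 4.26) were obtained in the same way from disagreement counts that the text does not report.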


Discussion 


The results obtained in this study may be 
used to draw tentative conclusions as to what 
type of test material is most likely to be of 
value in making predictions to group therapy 
criteria. Thematic test material, which elicits 
responses more similar to the behavior on 
which group criteria are based, gave better 
results in this investigation than did the Ror- 
schach in previous studies cited above. With 
normal subjects, this contrast may also be 
noted. Pepinsky et al. (11) obtained relatively
poor results with the Rorschach, while
Horwitz and Cartwright (7) found many sta- 
tistically significant relationships when a pic- 
ture of a small group meeting was used as 
a projective device. Drawing on the present 
study and TAT techniques such as those of 
Zimet and Fine (17, 18), it seems likely that 
data which deal with the intensity and ap- 
propriateness of interpersonal feelings may be 
a particularly germane source of information 
for the selection of patients for group therapy. 
Such test responses offer a pool of behavior 
from which scores can be derived that will 
cut across diagnostic and sociological categories
and, through the extrapolation of similar
behaviors, may be related to criteria of
readiness for or progress in group therapy. At 
the present time, the use of such psychologi- 
cal measures seems to be the best means of 
resolving the muddled situation described by 


Harris (5). 
New Tests


Aliferis, James. Aliferis Music Achievement Test. 
College entrants. 1 form. (40) min. Test booklet, 
pp. 8 ($3.00 per 20); scoring key (50¢); tape recording
($9.50); manual, pp. 28 ($3.00); specimen
set ($3.75). Minneapolis: Univer. of Minnesota
Press, 1954.

Evidence in the manual shows that this test is a 
carefully constructed measure of the auditory-visual 
discrimination of melodic, harmonic, and rhythmic 
elements and idioms of music, for use at the college 
entrance level. The test may be administered with 
the piano, but the use of a tape recording is recom- 
mended for uniformity. In the test booklet, the ex- 
aminee makes multiple-choice responses to indicate 
the musical notation of what he hears. Reliability of 
the whole test is reported as .88, and of three sub- 
scores, .84, .72, and .67. Data from four midwestern 
universities show that the total score correlates .61 
with grades in first-year music courses, in comparison 
to a correlation of .25 with grades in academic 
courses. For individual guidance, the author recom- 
mends that the test be used in combination with a 
test of sensory discriminations such as the Seashore, 
an appraisal of performance on an instrument, and 


a verbal intelligence test—L. F. S. 


Gordon, Leonard V. Gordon Personal Inventory.
High school, college, adult. 1 form. [...] min.
Question booklet ($2.45 per 35), with key, manual,
pp. 8; specimen set (20¢). Yonkers, N. Y.: World
Book Co., 1956.

The Personal Inventory is presented as a measure
of four factors of personality: cautiousness (C),
original thinking (O), personal relations (P), and
vigor (V). It is a companion instrument to the author’s
previously published Personal Profile which
obtains scores on ascendancy (A), responsibility (R),
emotional stability (E), sociability (S), and [...]
self-evaluation (T) (J. consult. Psychol., 1954, 18,
[...]). Like the earlier questionnaire, the Inventory is
characterized by brevity, the use of tetrads of items
controlled for social desirability, and [...]
identification of components by [...]. [...] The
reliabilities of the four new
scores range from .77 to .88 in college and high school
groups. The traits are relatively independent of one 
another (— .06 to .47), and of the four traits meas- 
ured by the Profile (— .16 to .47). Correlations with 
intelligence are mainly insignificant. Norms, rightly 
called tentative, are based on about 500 cases from 
each of four groups: high school boys and girls and 
college men and women, all from the same section 
of the United States. Validation is discussed only in 
terms of the construct validity inferred from the fac- 
tor analysis. Users are cautioned appropriately against 
drawing conclusions from small differences in scores. 
The two Gordon questionnaires commend themselves 
favorably for use when economy of time is essential. 
Few other instruments obtain as broad a picture of 
self-reported personality in less than 30 minutes— 


L. F. S. 


Kuder, G. Frederic. Kuder Preference Record—Occu- 
pational, Form D. High school, college, adult. 1 
form. (20-30) min. IBM or hand scoring. Ques- 
tion booklet, pp. 11 ($9.80 per 20); answer sheet 
($6.25 per 100); occupational keys ($1.00 each) ; 
manual, pp. 12 (50¢); specimen set ($2.00) ; re- 
search handbook, pp. 47 ($2.50). Chicago: Science 
Research Associates, 1956. 

The newest member of the Kuder family is a blank 
designed to measure the resemblance of the examinee’s
interests to those of persons in specific occupations. 
Like the author’s Personal (Form A) and Vocational 
(Form C) preference records (J. consult. Psychol., 
1949, 13, 67), the new inventory was developed with 
great competence, and its manuals communicate in- 
formation of wide scope. The blank consists of 100 
triads of statements which best represent each area 
of the preceding questionnaires and have minimal 
correlations with other areas. In view of relations of 
all 15 of the areas covered by Form A and Form C
to occupational choice or job satisfaction, this repre- 
sentative pool of items seems likely to provide a suffi- 
ciently broad base for discriminating among occupa- 
tions. Each statement describes an activity; names 
of occupations are avoided in order to promote 
subtlety of discrimination. The vocabulary is at or 
below the sixth grade level. Each occupational key 
was developed by comparing the responses of at 
least 100 persons in the occupation, usually more, 
[...] 1,000 males. For each [...]
of a [...] cases is reported.
[...] as “differentiation ratios”:
[...] with which the ex-
[aminee ...] as a member of the cri-
[terion ...] Keys are presently available for six
[...] have sought to give excessively
[...] The Research
Handbook is the most exceptional adjunct. It reports
[...] Fur-
ther, it gives explicit instructions for the construction
of new keys for local purposes, by differentiating
a criterion group from the general norm
[...] is itself a [...]
record commend its use
in guidance, employment selection, [...]
few faults, mainly stemming from
[...]er of keys available, are almost
[...] remedied.
L. F. S.


Remmers, H. H., & Bauernfeind, R. H. SRA Junior
Inventory, Form S (Rev.). Grades [...].
(40) min. Question booklet, pp. 8 ($2.[...]);
pupil profile ($1.05 per 20); manual, pp. [...];
specimen set (50¢). Chicago: Science Research As-
sociates, 1955, 1957.

[...] (J. consult. Psy-
chol., 1952, 16, 160). Instead of responding by mark-
ing only the statements which represent problems
for him, a pupil now indicates whether the item
is [...] “a middle-sized problem,”
“a little problem,” or “no problem.” Weighted scor-
[...] an expendable booklet, in-
stead of using a separate answer sheet which [...]
examinees. The profile includes
five areas: school, home, myself, people,
and general. The revised form is clearly [...]
[...]tion provided in the manual, and
the suggested interpretations.
The Factorial Structure of the WAIS between Early
Adulthood and Old Age


Jacob Cohen 


Franklin D. Roosevelt Veterans Administration Hospital 
and New York University 


The publication of the Wechsler Adult In- 
telligence Scale (WAIS) (8, 20) has, for the 
first time, made it possible to investigate the 
factor content of a battery of individually ad- 
ministered reliable intelligence subtests on a 
large well-selected adult standardization popu- 
lation over a wide age range. Study of these 
data should provide some insight into the 
aging process. Further, it makes possible an 
estimate of the factorial equivalence of the 
WAIS and its predecessor, the Wechsler-Belle- 
vue Intelligence Scale, which has been sub- 
jected to considerable factor analytic study 
(1, 2, 4, 5, 7, 11, 12, 13, 16). Finally, it yields 
a factor analytically based rationale for the 
measurement functions of the subtests on a 
normal population which supplements a simi- 
lar analysis of these subtest item types per- 
formed on the Wechsler-Bellevue for neuro-
logical and psychiatric patients (5).


Method 


Subjects

The four age groups studied were as fol- 
lows: ages 18-19 (N = 200), ages 25-34 
(N = 300), ages 45-54 (N = 300) and ages 
60-over 75 (N = 352). The first three groups 


1 From the Psychology Service, Veterans Adminis-
tration Hospital, Montrose, New York. The author
is in many ways obligated to Drs. Leon L. [surname illegible],
Manager, George Rosenberg, Director of Professional
Services, Oskar Diethelm, Chairman, Deans [...],
and Seymour G. Klebanoff, Chief, Psychology
Service, but mostly for creating the type of atmos-
phere conducive to research. A special debt of grati-
tude was incurred to Professor David Wechsler, who
[...] this investigation.

2 A [...] based on this investigation was read at
the meetings of the American Psychological Associa-
tion in Chicago on September 5, 1956.


were part of the regular standardization sam- 
ple, a sample stratified by age, sex, geographic 
region, urban-rural residence, race, occupation,
and education (20). The 60-75 age group was 
a supplementary standardization group ob- 
tained in the Kansas City area which was se- 
lected so as to make it representative of the 
over-60 group in that area (8). Of the 352 
cases in this group, 160 were men and 192
women.


Analysis 

For the younger three groups, the matrices 
of intercorrelations among the subtests given 
in the manual (20) were used. For the Kan- 
sas City sample, Doppelt and Wallace pro- 
vide separate matrices of intercorrelations for 
the four age groups (60-64, 65-69, 70-74, 
and 75 and over) which comprise the total 
sample of old persons (8, pp. 324-327). These 
four matrices were combined into a single
matrix by averaging via Fisher’s z transfor-
mation the 55 sets of four coefficients between
the same subtests (4, pp. 133-134).
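
The averaging step described above can be sketched as follows. This is only an illustration under stated assumptions: the four coefficients are hypothetical, the mean is unweighted (the text does not say whether subgroup sizes were used as weights), and the function name is invented for the example. With eleven subtests there are 55 distinct subtest pairs to which the same operation would be applied.

```python
import math

def average_correlations(rs):
    """Average correlations via Fisher's z: r -> atanh(r), take the mean, map back with tanh.

    Assumption: an unweighted mean of the z values; the source does not
    state whether the four subgroup coefficients were weighted by N.
    """
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))

# Hypothetical coefficients for one subtest pair in the four old-age
# subgroups (60-64, 65-69, 70-74, 75 and over); not values from the study.
example = [0.60, 0.70, 0.65, 0.55]
pooled = average_correlations(example)

# Number of distinct pairs among the eleven WAIS subtests.
n_subtests = 11
n_pairs = n_subtests * (n_subtests - 1) // 2
```

Averaging in the z metric rather than averaging the raw coefficients is the usual reason for the transformation: the sampling distribution of z is more nearly symmetric than that of r.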

The four matrices thus obtained were sepa- 
rately subjected to the following analysis: 3 


1. Thurstone’s complete centroid method was used 
(18, pp. 161-170), with communalities estimated by 
his Equation 15 (18, pp. 300, 318). The solution was 
not reiterated. Three criteria were used to test for 
the completion of the extraction process, those of
Saunders (3, pp. 300-301), McNemar (14), and Burt 
(6). In all four instances, Saunders’ criterion re- 
sulted in the extraction of five factors, while the 


3 To save printing costs, tables giving centroid
loadings and communalities, transformation matrices
and intercorrelations among primaries have been de-
posited with the American Documentation Institute.
Order Document No. 5277, remitting $1.25 for micro-
film or $1.25 for photocopies.
2. Rotation was done blindly [...]
the four groups, using Thurstone’s method of two-
dimensional sections (18, pp. 194-216). The rotation
was to a criterion of oblique [...]
recommended by Cattell (3, pp. 235-236).

[...] a second-order
general factor analysis (18, pp. 273-277, 421-434),
[...] by Thomson (17, pp. 189-191).


Results and Discussion

Five factors resulted in all bu[t ...]
as significant [...] and [...] to facilitate [...]
are followed by * [...]
younger groups, this discrepancy was resolved.
What had occurred was that
the fourth and fifth centroids ha[d ...]
[...]ed as follows: This investigation can be looked upon
[as] a set of four replications of [...]
[th]erefore, [...]
guard against the acceptance of [...]
the [...] 45-54 [...]
Table 1
Factor A Loadings of the Four Age Groups
on the WAIS

                        Age
Subtest    18-19   25-34   45-54   60-75+
Inf.       30*     21+     36      29*
Comp.      33*     45**    27+     39
Arith.     09      01      06      04
Sim.       23*     20      30*     42
D. Sp.     -10     -05     -06     00
Voc.       24*     4?**    37*     37
D. Sym.    08      04      01      -06
P. C.      07      03      04      07
B. D.      -03     ?       00      0?
P. A.      14      06      00      30
O. A.      -08     00      00      -01


[...] organized by factor and show the loadings of all
age groups a factor at a time.


[...] has emerged in all [...]
studies of the Wechsler-Belle[vue ...]
groups at different age levels [...]
2, 13, 16) as well as in psychi[atric ...]
5, 13, 16). There is little [...]
is the same factor found in all [...]
[...]ions in the intellectual do[main ...]


[...] that at advanced ages at least [...]
“telling the story [...]
themselves in order to achieve solutions,
and vary among themselves in the ability to
do so successfully. It may also be related to
the already noted deficiency of this group in
Factor E, which, in the next youngest group
(45-54), does carry a loading on this test
(Table 4).


Factor B: Perceptual Organization


The two tests loading this factor consist-
ently in all four age groups are Block Design
and Object Assembly. Present in all but the
45-54 group, with just barely acceptable
loadings, is Picture Arrangement. A significant
loading for Digit Symbol appears only for the
oldest group.

This factor has been consistently found in
factor analyses of the Wechsler-Bellevue, al- 
though it has variously been identified as per- 
formance (1, 13, 16), spatial-perceptual (12), 
closure (2), and nonverbal organization (4). 
Slight differences in the subtests which have 
loaded this factor are attributable to differ- 
ences in rotational criteria (1) and insuffi- 
ciency of factor extraction due to relatively 
small Ns (2, 13, 16). The common content 
of these interpretations is the non-verbal, 
perceptual, organizational characteristics. Be- 
cause of the loadings of Picture Arrangement 
and Digit Symbol, this factor resists inter- 
pretation as a spatial factor. Clearly, however, 
both speed and quality of perceptual perform- 
ance and organization are involved; hence, it 
is here named Perceptual Organization. It 
very likely represents a few highly correlated 
factors such as Thurstone’s Perceptual Speed, 
Closure, and Spatial Relations (19), which do 
not emerge separately because of the paucity 
of tests in this subdomain of the intellectual 
sphere. With reference tests included in the 
matrix, Davis (7) achieved such a further 
fractionation of this factor in the Wechsler- 
Bellevue. 

Table 2
Factor B Loadings of the Four Age Groups
on the WAIS

                        Age
Subtest    18-19   25-34   45-54   60-75+
Inf.       ?       -06     0?      -09
Comp.      ?       03      ?       01
Arith.     10      00      02      06
Sim.       -07     -10     06      10
D. Sp.     -09     03      00      09
Voc.       -09     07      -03     -02
D. Sym.    14      09      04      29*
P. C.      ?       ?       ?       ?
B. D.      34*     30*     35*     58*
P. A.      24      22      -03     20
O. A.      34*     45*     31*     56*


Table 3
Factor C Loadings of the Four Age Groups
on the WAIS

                        Age
Subtest    18-19   25-34   45-54   60-75+
Inf.       04      10      10      41**
Comp.      09      -10     08      38*
Arith.     32*     32*     33*     ?
Sim.       05      -02     -10     10
D. Sp.     28*     24*     35*     45**
Voc.       09      -02     03      ?
D. Sym.    08      -02     09      21*
P. C.      -01     -06     02      01
B. D.      12      07      01      09
P. A.      -10     09      -02     -09
O. A.      05      -01     00      -03


No hypothesis is offered to account for the
failure of Picture Arrangement to load signifi-
cantly in the 45-54 group. The appearance of
Digit Symbol in the 60-over 75 group is note-
worthy, since it parallels a finding in a previ-
ous investigation of neuropsychiatric groups 
with the Wechsler-Bellevue (4, 5). There it 
was found that although Digit Symbol did not 
load Factor B significantly in psychoneurotic 
and schizophrenic groups, it loaded quite 
heavily in the brain-damaged group. This was 
interpreted as due to the fact that “. .. a 
greater part of the variance of this test in the 
brain-damaged is associated with visual or- 
ganization and simple speed than is the case 
with the non-brain-damaged” (5, p. 276). 
This interpretation may well apply here, the 
operative factor being the brain damage which 
occurs (differentially) with senescence, and 
which results in enough perceptual speed and 
discrimination variance for it to load Factor B. 


Factor C: Memory 


For the youngest three groups, Arithmetic 
and Digit Span are the only two subtests load- 
ing Factor C. This factor, too, has appeared in 
most factorizations of the Wechsler-Bellevue, 
and although its test composition was not al- 
ways identical, it has always been character- 
ized by high loadings on Arithmetic and Digit 
Span (1, 2, 4, 5, 13, 16). It has variously 
been interpreted as memory (1, 2), freedom 
from distractibility (4, 5), attention-concen- 
tration (13), and concentration-speed (16). 
These interpretations are not as diverse as 
they may seem at first glance; it is not un-
reasonable to suppose that effective memory
ability, particularly in a test situation, is de-
pendent upon the ability to attend during
both the reception and reproduction phases
of the learning-remembering process. Because
[...] more inclusive concept, and [...]
found Digit Span to load only [...]

[...] phenomenon [...] occurs in the [...]
for the 60-[...] interpretation of C as [...]
anything, it substantiates it. The hypothesis
is offered that with senescence [...]
[...]als begin to deteriorate, or to state it more
accurately, individual [...]
deterioration occur. This deterioration is re-
flected in difficulties in the memory function
which becomes an important source of indi-
vidual differences (hence, score variance), and
is more widely influential in its effect on test
performance, invading the verbal subtests and
Digit Symbol. Obviously, to respond success-
fully to Vocabulary or Information items at
any age requires that the information be re-
membered, but until old age is reached, the
variance in memory ability involved in these
tests is inconsequential. With old age and dif-
ferential rates of deterioration, scores on the
verbal tests come to depend as much or more
on memory ability as they do on verbal com-
prehension ability.


Table 4
Loadings of Factors D and E of the Four Age Groups on the WAIS

                      Factor D                       Factor E
                        Age                            Age
Subtest    18-19   25-34   45-54   60-75+    18-19   25-34   45-54
Inf.       04      15      -08     10        01      02      -03
Comp.      ?       ?       ?       ?         -08     -07     -10
Arith.     ?       ?       ?       ?         ?       -09     -04
Sim.       ?       ?       ?       02        ?       00      17
D. Sp.     08      -05     02      03        26*     10      06
Voc.       16      -09     ?       -09       07      02      10
D. Sym.    -08     01      00      20*       24*     25*     31
P. C.      ?       ?       ?       ?         ?       06      ?
B. D.      03      04      ?       00        00      00      02
P. A.      06      ?       22*     10        04      04      22
O. A.      04      -06     04      03        ?       00      -06


Factors D and E 


Factor D loads Picture Completion consist-
ently in all four [...]
variously with Similarities (18-19), Picture
Arrangement [...]
60-over 75). [...]
perfectly inconsistent, are not trustworthy for
use in the interpretation of the factor. [...]

A similar state of affairs exists for Factor E.
It loads Digit Symbol consistently in the three
groups in which it appears, alone in the 25-34
group, with Digit Span (18-19), and with
Picture Arrangement (45-54). Again, the fac-
tor is a quasi-specific which is left uninter-
preted other than identifying it as a Digit
Symbol factor. Its deficiency in the 60-over
75 group is not surprising. In this group, Digit
Symbol gives up its communal variance to
atypical loadings on Factors B, C, and D.
Apparently, the specific ability demanded by
Digit Symbol at younger ages ceases to be
important in senescence, and three other fac-
tors which had previously not affected Digit
Symbol begin to do so. 

With regard to both Factors D and E, it 
should be noted that although they may be 
encroached upon by error, the consistency 
with which they load Picture Completion and 
Digit Symbol, respectively, makes them un- 
questionably real phenomena. They are minor 
factors which give rise to small portions of 
the total communal variance, and, being spe- 
cifics, do not help in the understanding of 
what the subtests measure, except negatively. 
They do have some importance, however, in 
bringing the results of this study into line 
with those of factor-analytic studies of the 
Wechsler-Bellevue. The discrepancies have, on 
the whole, been minor, and are partly at- 
tributable to different rotational criteria (or- 
thogonal vs. oblique) and insufficient rotation. 
The present results suggest that these re-
searchers failed to extract enough factors.
They were justified in so doing by their
relatively small Ns, and the risk of accepting
as real a factor which represents only sam- 
pling error. In the present investigation, the 
fact that four large groups were studied inde- 
pendently and with blind rotations gives much 
confidence in the reality of the results. Previ- 
ous studies, in extracting only three factors 
(the typical number), had the Picture Com- 
pletion test loading the B factor (12, 13, 16), 
or both the A and B factors (2, 4, 5), and the 
Digit Symbol subtest loading the C factor (4, 
5, 12), the B factor (4, 5), or both the B and 
C factors (2, 16). These two tests have been 
the ones showing by far the greatest incon- 
sistencies among studies, the reason for which, 
it is suggested here, is simply the consequence 
of the technical problem of insufficient factor 
extraction rather than of true differences in 


factor make-up. 


The Second-Order Analysis 

The 36 intercorrelations among the primary
factors for the four groups (see footnote 3)
ranged from .43 to .89, with a median of .70,
the middle 18 falling between .61 and .80, in-
clusive. This gave rise to a strong general sec-
ond-order factor. Table 5 gives the correlations
of the subtests and the primary factors with
G, which is interpreted as present general in-
tellectual ability. The G correlations of the
subtests are quite high (median .70), and of
the primaries even higher (median = .85). 
Given these magnitudes, there can be little 
question that the WAIS Full Scale IQ is 
loaded very strongly with G for adults, cer- 
tainly up to the age where deterioration be- 
gins to set in. 
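
As a check on the count of 36 primary-factor intercorrelations, the sketch below assumes five primary factors in each of the three younger groups and four in the 60-over 75 group (Factor E does not appear there). The dictionary and variable names are illustrative only.

```python
from math import comb

# Assumed number of primary factors per age group, following the text:
# five in the three younger groups, four in the oldest group.
factors_per_group = {"18-19": 5, "25-34": 5, "45-54": 5, "60-over 75": 4}

# Each group contributes one correlation per distinct pair of factors.
pairs_per_group = {g: comb(k, 2) for g, k in factors_per_group.items()}
total_intercorrelations = sum(pairs_per_group.values())

# "The middle 18" is then the central half of these coefficients.
middle_half = total_intercorrelations // 2
```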

As has been found in the past (4, 5), the 
“essentially verbal” subtests, i.e., those load- 
ing the A factor, particularly Vocabulary and 
Information, are the best measures of G over 
most of the adult age range, and Object As- 
sembly and Digit Span the poorest. As do 
most generalizations about our data, this 
breaks down partly for the 60-over 75 group. 
Not only does the G variance fall off in this 
group (see below), but the Factor A tests are 
no longer clearly the best measures of G. In 
particular, the correlation with G of Vocabu- 
lary, both absolutely and relative to those of 
the other subtests, falls off sharply, and pre-
sents yet another factor-analytic validation of
a consequence of the Babcock hypothesis (4,
5). That is, Vocabulary does not measure
present general intellectual ability in the aged
as well as it does in younger normal groups,
or as well as some other tests do. The hy-
pothesized reason for this is that Vocabulary, [...]


Table 5
Correlations with G of WAIS Subtests and Primary
Factors for the Four Age Groups

                        Age
Subtest    18-19   25-34   45-54   60-75+
Inf.       83      84      82      73
Comp.      69      71      76      60
Arith.     6?      71      74      63
Sim.       ?       75      75      65
D. Sp.     63      59      65      48
Voc.       ?       79      83      66
D. Sym.    56      64      65      71
P. C.      ?       72      77      75
B. D.      71      71      69      65
P. A.      56      69      74      65
O. A.      65      59      68      58

Primary
Factor
A          ?       ?       90      76
B          ?       ?       ?       ?
C          81      92      89      62
D          ?       ?       ?       92
E          79      86      78      none


Table 6
Per Cent Contribution to the Total Variance of G and
Each Factor Specific for the Four Age
Groups on the WAIS

                        Orthogonal Factor
Age Group    G      A     B     C     D     E     Total
18-19        52.7   4.5   6.0   2.3   2.4   1.7   69.
25-34        50.0   5.2   5.3   1.7   2.5   1.3   66.0
45-54        54.3   4.2   2.8   2.3   0.9   1.6   66.1
60-75+       42.1   5.9   6.1   9.7   0.5   none  64.7


[...] it is apparent that about
two-thirds of the total variance of the eleven [...]

[...] The general factor is responsible for the shar-
ing of about one-half of the total variance or
about three-quarters of the [communal vari]ance
(at least for the younger three groups),
and is by far [...]
and this difference is instructive. While the [...]
[in]fluential in producing correlation for older per-
sons, that is, a differentiation of intellectual
functioning occurs with advanced age.

Since the above occurs with no great reduc-
tion in the communal variance, the comple-
mentary finding is that the independent con-
tribution of the primary factors is greater in
the aged group. This is true for the three ma-
jor factors, but most strikingly for the Mem- 
ory factor. Whereas the contribution of Fac- 
tor C to the total variance for the three 
younger groups averages 2.1%, for the 60-
over 75 group it is 9.7%. Thus, it appears
that with advanced age, and its attendant dif-
ferential rates of deterioration, general ability
plays a lesser role, and Memory a greater role
in WAIS performance.
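
The arithmetic behind these statements can be retraced from Table 6. The sketch below is illustrative only: the percentages are as read from the scanned table, with decimal placement assumed where the scan dropped the point, and the communal variance is recomputed as the sum of the listed components, so it may differ slightly from the printed totals.

```python
# Per cent contributions as read from Table 6 (decimal placement assumed).
table6 = {
    "18-19": {"G": 52.7, "A": 4.5, "B": 6.0, "C": 2.3, "D": 2.4, "E": 1.7},
    "25-34": {"G": 50.0, "A": 5.2, "B": 5.3, "C": 1.7, "D": 2.5, "E": 1.3},
    "45-54": {"G": 54.3, "A": 4.2, "B": 2.8, "C": 2.3, "D": 0.9, "E": 1.6},
    "60-over 75": {"G": 42.1, "A": 5.9, "B": 6.1, "C": 9.7, "D": 0.5},  # no Factor E
}
younger = ["18-19", "25-34", "45-54"]

# Mean independent contribution of Factor C (Memory) in the younger groups.
mean_c_younger = sum(table6[g]["C"] for g in younger) / len(younger)

def g_share_of_communal(row):
    """Share of the communal variance (sum of all listed components) due to G."""
    return row["G"] / sum(row.values())

g_shares = {group: g_share_of_communal(row) for group, row in table6.items()}
```

On these figures the G share of the communal variance is roughly three-quarters in each of the younger groups and clearly lower in the oldest group, which is the pattern the text describes.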


Subtest Measurement Functions


Implicit in the previous material is a psy-
chometric rationale for the measurement func-
tions of the subtests which will be made ex-
plicit in a later article. One aspect of this is
[...]
to measure distinct psychological functions.
Table 7 presents for the three younger groups
the specificities of the subtests, i.e., the per-
centage of the total subtest variance which is
[...] variance measured by only that
[subtest. ...] by subtracting from the
subtest reliabilities at each age (20) their as-
sociated communalities as found in the analy-
sis. The specificities for the 60-over 75 group
are omitted due to the unavailability of reli-
ability coefficients. The values for Digit Sym-
bol are enclosed in parentheses because its re-
liability was obtained for a young group of
nursing students not strictly comparable with
the standardization samples.
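
The subtraction described above amounts to partitioning each subtest's total variance into common, specific, and error portions. A minimal sketch, assuming the standard partition and using hypothetical reliability and communality values (not figures from the WAIS manual or from this analysis):

```python
def variance_partition(reliability, communality):
    """Split unit total variance into common, specific, and error parts.

    specificity = reliability - communality  (reliable variance not shared)
    error       = 1 - reliability            (unreliable variance)
    """
    if not 0.0 <= communality <= reliability <= 1.0:
        raise ValueError("expected 0 <= communality <= reliability <= 1")
    return {
        "common": communality,
        "specific": reliability - communality,
        "error": 1.0 - reliability,
    }

# Hypothetical subtest: reliability .90, communality .76.
parts = variance_partition(0.90, 0.76)
specific_pct = round(100 * parts["specific"])
```

A specificity of this size, about one-seventh of the total variance, is of the same order as the median the table reports.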


Table 7
Subtest Specificities for the WAIS for
Three Age Groups

                    Age
Subtest    18-19   25-34   45-54
Inf.       10      10      ?
Comp.      15      08      ?
Arith.     19      22      17
Sim.       13      20      ?
D. Sp.     19      24      ?
Voc.       10      04      ?
D. Sym.    (?)     (?)     (?)
P. C.      11      16      18
B. D.      14      19      42
P. A.      10      02      10
O. A.      00      06      [one value illegible]


As can be seen in Table 7, the values are
quite small with a median of 14%. Thus, on 
the average, only one-seventh of the subtests’ 
variance is not attributable to common fac- 
tors and error. Under these circumstances, the 
attribution of specific measurement functions 
to the subtests as has been done by such cli- 
nicians as Rapaport (15) in connection with 
the Wechsler-Bellevue, is clearly unjustified, 
as has already been noted (5). 

Further, these small specificities account for 
the essentially disappointing results of the re- 
search efforts of the past ten years in pat- 
tern analysis with the Wechsler-Bellevue. The 
specificity of the Wechsler-Bellevue subtests 
is even smaller than those of the WAIS be- 
cause of the lower subtest reliabilities. Since 
subtest score differences are largely reflective 
of relatively small specificity variances and 
relatively large cumulated error, it is not sur- 
prising that these research efforts have con- 
stituted a futile quest for a philosopher’s 


stone. 


Age and Intellectual Organization 


A review of Tables 1 through 5 reveals a 
remarkable degree of similarity among the 
youngest three groups with regard to intel- 
lectual organization. In the case of the ma- 
jor factors, with only one minor exception,6
wherever a factor loading exceeds .20 in any
one group, it does so in both of the others.
The evidence seems impressive that the or-
ganization of intellectual functioning (on the
WAIS, at least) is [...] invariant between the
ages of 18 and 54.

This generalization cannot be extended to
the 60-over 75 group. All three of the ma-
jor factors undergo an increase in their inde-
pendent variance contributions (Table 6). Al-
though Factors A and B continue to load the
[...] they loaded on the younger
groups, Factor A picks up Picture Arrange-
ment, and Factor B Digit Symbol. The most
striking change occurs in Factor C. The
Memory factor, as already noted, spreads
over several new tests and comes to be inde-
pendently responsible for almost 10% of the
total variance, with a concomitant reduction
in the amount of G variance. Thus, a real
change occurs in intellectual organization in
the elderly, with memory playing a far more
important role in determining individual dif-
ferences in test performance.

6 The exception is the failure of Picture Arrange-
ment to load Factor B in the 45-54 group. Since the
[...] it on the other two groups is weak (24 and
22), this deviation is probably within sampling error
expectations.

The present analysis casts some doubt on
an accepted conception of intellectual organiza-
tion with regard to the lower end of the de- 
velopmental scale. Garrett (10) presents a 
“differentiation” hypothesis to the effect that, 
“Over the elementary school years we find a 
functional generality among tests at the sym- 
bol level. Later on this general factor or ‘g’ 
breaks down into the quasi-independent fac- 
tors reported by many investigators (10, p. 
376).” The evidence that he adduces for this 
view is largely the reduction in correlation 
among tests and factors in college students as 
compared to school children. On its face, this 
evidence is ambiguous since college students 
fall in the upper tail of the intelligence dis- 
tribution, and the patent selection for intelli- 
gence involved would lead to the expectation 
that test correlations would decline. 
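
The selection argument can be illustrated with a small simulation. Everything here is an assumption made for the illustration: two test scores share a single general factor with equal loadings of .8, and "college students" are modeled simply as cases more than one standard deviation above the mean on that factor. It is a sketch of range restriction, not a reanalysis of Garrett's data.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson product-moment correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate(n=20000, loading=0.8, cutoff=1.0, seed=0):
    """Return (r in the full sample, r in the upper-tail selected sample)."""
    rng = random.Random(seed)
    unique_sd = math.sqrt(1.0 - loading ** 2)
    cases = []
    for _ in range(n):
        g = rng.gauss(0.0, 1.0)                       # general factor
        x = loading * g + unique_sd * rng.gauss(0.0, 1.0)
        y = loading * g + unique_sd * rng.gauss(0.0, 1.0)
        cases.append((g, x, y))
    selected = [c for c in cases if c[0] > cutoff]    # assumed selection rule
    r_full = pearson([c[1] for c in cases], [c[2] for c in cases])
    r_selected = pearson([c[1] for c in selected], [c[2] for c in selected])
    return r_full, r_selected

r_full, r_selected = simulate()
```

Because selection shrinks the variance of the general factor while leaving the unique variances untouched, the correlation between the two tests drops in the selected group even though nothing about the structure of ability has changed.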

In the present investigation, we find that at 
college age (18-19) as well as later, for un-
selected samples of the population, the general 
factor accounts for a substantial amount of 
the total variance of the WAIS subtests, that 
is, about half, which in turn is about three- 
fourths of the communal variance. This does 
not leave much room for greater generality at 
younger ages. Thus, the present findings, al- 
though incomplete, cast serious doubt on the 
notion that the general factor “differentiates” 
into quasi-independent factors by the late 
teens. Whatever “differentiation” occurs does 
so around retirement age, and then “deteriora- 
tion” is probably a better descriptive term. 

As age increases over our samples, mean 
education decreases (18), and the question 
arises as to whether our results may not be 
due to education. This is considered unlikely 
on two grounds. Firstly, although education 
drops steadily between 25-34 and our oldest
group, there is no drop in G variance between
25-34 and 45-54; in fact, a small increase
occurs, and between 45-54 and 60-over 75,
a large decrease occurs (see Table 6). One 
would need to postulate a curvilinear relation- 
Psychological Changes over a Five Year Period
Following Bilateral Prefrontal Lobotomy


Isidor W. Scherer, C. James Klett 
VA Hospital, Northampton, Massachusetts 


and John F. Winne 
Washington, D. C. 


In previous publications, Scherer et al. (1,
2) have reported changes in a standard bat-
tery of psychological tests administered at
fixed time intervals over a period of three 
years to a group of lobotomized schizophrenics 
and their equated controls. The present re- 
port extends these observations to the fifth 
year after lobotomy and attempts to arrive at 
some summary statements to describe the out- 
come of five years of lobotomy research. 


Experimental Procedure 


Subjects 
Fifty white male schizophrenic patients at
the Veterans Administration Hospital, North-
ampton, Massachusetts have served as sub-
jects (Ss) during some phase of this [...]
study. Of these patients, 28 were lobotomized,
the operation consisting of a standard, open
Lyerly-Poppen approach (2, p. 6), while the
remaining 22 constituted the control group.
The Ss for the fifth year of testing consisted
of 13 lobotomized and 12 control pa-
tients, fairly well equated for age, education,
length of hospitalization, diagnostic classifica-
tion, and degree of cooperativeness. [...]
the experiment all patients had received elec-
troshock treatment, which was unsuccessful or
only temporarily beneficial, and all had been
accepted as suitable candidates for lobotomy
by the medical staff of the hospital.

Although all these Ss were included in the
previous studies, the groups at the different
time intervals cannot be considered completely
equivalent because of the attrition due to dis-
charge and other reasons. Presumably the
most intact patients were the ones who were
ultimately discharged.

1 [...] Veterans Administration Hospital,
Northampton, Massachusetts.


Test Material 


The battery utilized in the fifth year of 
testing consisted of 21 tests that yielded the 
40 “clear-cut measures of functional effi- 
ciency” reported in the three-year study (1). 
A description of these tests and the measures
derived from them can be found elsewhere
(2).


Results and Discussion


A summary of the changes on the 40 meas- 
ures of functioning efficiency for five testing 
periods appears in Table 1. Included are the 
direction of change within the experimental 
and control groups and the net change in the 
experimental group over the control group. 
All recorded changes are statistically signifi- 
cant at the .20 level or better, consistent with 
practice in the earlier studies. 

Examination of the five-year columns of 
Table 1 shows that the experimental group 
has gained on 16 measures and lost on only 
2 when compared with testing prior to lo- 
botomy. During the same period, the control 
group gained on 11 measures and showed a
loss on 2. The net change column shows that 
at five years the experimental group has made 
significantly more gains over its preoperative 
baseline than the control group on 10 meas- 
ures with significantly less gain on 3. Of the 
10 measures showing net gain for the experi- 
Table 1
Direction of Change from Preoperative Baseline in Forty Tests of Functioning Efficiency
Over a Five-Year Period

Column groups: Experimental Group, Control Group, Net Change, each across the testing periods.
[Column headings and cell entries illegible in source.]

Measure

Digit Symbol
  Errors (1)
  Score (2)
Digit Span
  Forward (3)
  Reversed (4)
  Weighted score (5)
Serial Sevens
  Time (6)
  Errors (7)
Hard Pairs (9)
Visual memory rights (10)
Vocabulary (17)
Memory Paragraph
  Immediate rights (18)
  Delay rights (21)
  Average rights (24)
  Immediate distortions (19)
  Delayed distortions (22)
  Average distortions (25)
  Immediate distorted order (20)
  Delayed distorted order (23)
Finger Dexterity Time (29)
Tweezer Dexterity Time (31)
Downey Total Time (36)
Halstead
  Right hand (38)
  Left hand (39)
  Both hands (40)
  Shapes (43)
  Position (44)
Object Sorting
  Score (48)
  Confabulation (49)
  Symbolism (50)
Similarities (51)
Series Completion (53)
Shipley-Hartford (55)
Categorization (57)
Trailmaking
  Time (58)
  Errors (59)
D-A-W Confusion (62)
D-A-M
  Accuracy (65)
  Confusion (66)
Bender Gestalt Elaboration (75)
Word Association Recall (81)

No. of Measures Showing
Gain 5 12 21 94 16 19 [...]
No change 31 25 45 12 22 [...]
mental group, 7 of them (2, 6, 10, 17, 21, 24,
and 53) represent a significant improvement
in the experimental group over their preopera-
tive level with either a concomitant loss or no
gain in efficiency on the part of the control
group. The other three measures reflect a sig-
nificant net improvement on the part of the
operated patients.2

These findings at the fifth year are further 
clarified by an inspection of the changes from 
the third to the fifth year after lobotomy. Al- 
though the earlier papers showed that the ex- 
perimental group had continued to improve 
over its preoperative level and over the con- 
trol group up to the third year, both groups 


have essentially stabilized since that time. At
five years, the experimental group achieved a
higher Shipley-Hartford score than it had at
three years. During the same period, the con- 
trol group committed fewer errors on the 
Serial Sevens test and introduced fewer dis- 
tortions in the immediate recall of the Mem- 
ory Paragraph but demonstrated more con- 
fusion on the Draw-a-Woman test. In terms 
of net change, the experimental group showed 
significantly less D-A-W confusion but signifi- 
cantly more Serial Sevens errors than the con- 
trol group when compared to their respective 
three-year levels. None of the remaining 114 
comparisons reached the .05 level of con- 
fidence. 
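
As a rough check on the reading that both groups had stabilized, the number of significant third-to-fifth-year changes can be set against what chance alone would produce. The sketch below is an illustration, not part of the original analysis. It assumes that the six significant changes named above plus the 114 remaining comparisons make up all the tests run (120 in total), and that the tests are independent, which test scores from the same patients are unlikely to be. The function names are hypothetical.

```python
from math import comb


def expected_chance_hits(n_tests: int, alpha: float) -> float:
    """Expected number of tests reaching `alpha` when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least(k: int, n_tests: int, alpha: float) -> float:
    """Binomial probability of k or more chance hits among independent tests."""
    return sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


# Counts taken from the paragraph above; treating them as one family of
# independent tests is an assumption made only for this illustration.
significant = 6   # changes named in the text as significant
remaining = 114   # comparisons that did not reach the .05 level
n_tests = significant + remaining

print("tests:", n_tests)
print("expected by chance at .05:", expected_chance_hits(n_tests, 0.05))
print("P(at least 6 by chance):", round(prob_at_least(significant, n_tests, 0.05), 3))
```

Under these assumptions the observed count is about what the .05 level would yield by chance, which is consistent with the authors' conclusion that there was essentially no change between the third and fifth years.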
To summarize briefly the changes over a 
five-year period that are shown in Table 1, 
both the experimental and the control groups 
have shown gains at every testing period over 
their preoperative level. In the interval from 
one to three years, the experimental group 
continued to show greater gain than the con- 
trol group, but after that time the groups 
seem to have stabilized. The experimental
group, however, maintained its
gain to the fifth year. Only at the two-week
testing period, when the postoperative effects 
could be expected to be most acute, did the 


2 Tables giving, for each measure, the number of
cases, the standard error of difference, t, and p
for the changes within the experimental and con-
trol groups, and for the net change in the operative
group during the period from before operation to
five years after operation have been deposited with
the American Documentation Institute. Order Docu-
ment ... remitting $1.25 for microfilm or ...


experimental group prove to be inferior to the 
control group in terms of net gain. These find- 
ings have some implication for those who fear 
the negative after-effects of lobotomy. It ap- 
pears that either the frontal lobes are not sig- 
nificantly related to the mental functions 
studied in this report or that lobotomy does 
not interfere significantly with the function 
of the frontal lobes. 

Assessing the positive effects of lobotomy 
is a somewhat more difficult task. At the end 
of one year it was concluded (2) that: (a) 
there was a tendency towards decreased men- 
tal efficiency, possibly associated with organic 
change; (b) ego boundaries were strengthened 
at least to the third month when there were 
some indications of weakened ego boundaries; 
(c) sexual awareness and differentiation were 
increased; and (d) there was an increase in 
the rate of motoric action. Additional changes 
suggested a tendency toward lack of inhibi- 
tion in the moral-social field. At the end of 
three years it was concluded (2) that there 
was evidence of (a) increased mental effi- 
ciency; (b) strengthened ego boundaries; 
(c) continued increase in sexual awareness; 
(d) increased rate of motoric action; and 
(e) more inhibition or less impulsivity on 
tests of imagination and ideation. 

Considering the changes in the psychologi- 
cal test measures at the fifth year, it might 
be concluded that in a global sense the lo- 
botomized patients demonstrated an increase 
in functioning efficiency. However, attempting 
to account for this overall improvement in 
terms of specific ego functions or in terms of 
specific test measures leads to ambiguity of 
interpretation since there is little consistency 
in the particular measures on which it is mani- 
fested. It is also possible that the individual 
personality structure reacts to surgical trauma 
in such an individualized manner that it over- 
whelms the potential specific effects of the 
lobotomy operation. In this context, it is im- 
portant to note Winne and Scherer’s report 
(3) of a second one-year study. They found 
that of 211 specific predictions derived from 
the first one-year study, only 15 were con- 
firmed in the second study. Furthermore, over- 
all improvement of the experimental group 
was not noted in their study. 

In order to retain the more stable of the 
Isidor W. Scherer, C. James Klett, and John F. Winne


In the three-year report it was suggested 
that in spite of the favorable showing by the 


en a comparison was made between the
experimental and control subjects that are still


the hospital although the difference
was not a significant one,


; upon periodic psy-
chological testing over a five-year postopera-
: Forty measures of functioning effi-


year are presented together


s, one year, and three years post-
he results of the latter testing
iscussed in detail in previ-
3 For those readers interested in disposition data
on the entire series of 104 patients lobotomized at
this hospital between October 23, 1947 and April 2a
ill in this hospital, four died, and AEE
er Veterans Hospitals. Twe d
ill in the hospital were dischars
€ occasion but were then readin
aining patients now out of the hos
were readmitted to this hospital at kas
once and it is, of course, unknown how many ha
ed to some other hospital,


ous publications. The conclusions are as fol-
lows:

1. Although both the control group and the 
experimental group continued to show gains 
on the measures of functioning efficiency up 
to the third year, both groups stabilized be- 
tween the third and the fifth year. There was 
essentially no net change between the third
and the fifth year. 

2. The lobotomy group was generally su- 
perior to its preoperative level and to the con- 
trol group after five years, i.e., it was able to 
maintain its gains.

3. No attempt was made to attribute posi- 
tive gains to specific ego functions because of 
inconsistency of individual measures from one 
testing period to another and because of 
Winne and Scherer’s negative findings in the
second one-year study.

4. Indicators of clinical improvement such
as discharge rate do not reflect the positive 
gain shown on psychological tests. 


Received October 9, 1956. 
The Question of Deterioration in Alcoholism


Derwood E. Johnson


Evansville State Hospital


ate the two groups significantly (p less than
.02). But this overall


due to the superior retention of personal and


current information in the alcoholic group.
Tremor, which


alcoholic patients, was noted on Wechsler
Memory reproductions in this study. Marked
tremor, indicated by obvious deviations in all

drawings, occurred more in
(chi square, p less than


The Raven Progressive Matrices (1938)
indicated no si
groups,
Performance patterns in all three
d graphically and statistically.
neurological deficit was inter-

the results, Kaldegg’s conclusion
a resemblance between his alcoholic
and psychoneurotic groups was reinforced by
alcoholic patients and 4
A Further Note on Chlorpromazine: Maze Reactions


S. D. Porteus and John E. Barclay
Territorial Hospital, Kaneohe, T. Hawaii


The purpose of the present article is to pre- 
sent a progress report on current research re- 
garding the effect of prolonged routine dosage 
of chlorpromazine¹ (300 mg. daily for pe-
riods from 6 weeks to 6 months) on Maze-
tested abilities of psychotic patients. Previous
results were reported in the February issue of
this Journal (2).


Method

The performance of an experimental group
of 35 cases has been compared with that of
25 control patients who have never had the
drug. The original or standard Maze Test was
applied before medication, the extension or
practice-free Maze Test after at least 6 weeks’
medication. The two forms of the test were
given to the control group of untreated hos-
pital patients with a varying interval of time
between testings. Inspection of the data did
not reveal any influence of elapsed time on
the extension score.

Difficulty in setting up controls. The well-
known inconsistency of behavior of many psy-
chotics may conceivably interfere with the
comparability of two test scores obtained at
different times. Because the Maze undoubt-
edly reflects temperamental factors, this in-
consistency does affect the setting up of con-
trols.

Investigators in the first two Columbia-
Greystone projects attempted to overcome the
difficulty by using only results “considered by
the examiner to be representative efforts of a
cooperative patient” (1, p. 184). Their sub-
jects were already partially stabilized. We
could not select patients in this way because
of the impossibility of determining what was
a “representative” performance.


1 The drugs and placebos used in this study were con-
tributed by the Smith, Kline, and French Labora-
tories, Philadelphia.


Although anxiety, suspicion, restlessness, or 
aggressiveness could be expected to lower 
Maze performance, especially among the con- 
trol group, the experimental group had al- 
ready been “tranquilized,” making any de- 
cline in their scores more significant (2). 


Comparative Results 


The previous report was based on the as- 
sumption that there should be, for psychotic 
patients, the same equivalence between stand- 
ard and extension Maze scores as was evinced 
by normals (3). The use of a control group 
was intended to substantiate this assumption. 

In the experimental group (N = 35) the
deficit in average Maze score after chlor-
promazine amounted to 1.89 years, whereas
the difference for the control group (N = 25)
between the standard and extension means
was only -0.1 year, thus proving that the


Table 1

Maze Scores of Experimental and Control Groups

Measure              Experimental    Control

N                        35            25
Standard Maze
  Mean                  11.90         11.96
  SD                     3.42          3.34
Extension Maze
  Mean                  10.01         11.86
  SD                     3.58          3.42
Difference              -1.89         -0.10
t                        2.26*
F                        4.99*
z a                      2.21*

* Significant at .05 level.
a Mann-Whitney test.
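
The Difference row of Table 1 can be checked directly from the group means. The short sketch below is only an arithmetic illustration: it recomputes each group's change (extension mean minus standard mean) and the net change between groups. The dictionary layout and function name are hypothetical, and it does not attempt to reproduce the t, F, or z values, which need the individual scores.

```python
# Means transcribed from Table 1 (Maze test ages, in years).
groups = {
    "experimental": {"n": 35, "standard": 11.90, "extension": 10.01},
    "control": {"n": 25, "standard": 11.96, "extension": 11.86},
}


def mean_change(group: dict) -> float:
    """Extension mean minus standard mean, rounded to two decimals."""
    return round(group["extension"] - group["standard"], 2)


changes = {name: mean_change(g) for name, g in groups.items()}

# Net change: how much more the experimental group lost than the controls.
net_change = round(changes["experimental"] - changes["control"], 2)

print(changes)
print("net change:", net_change)
```

The two changes agree with the Difference row; the net change of about 1.8 years is the quantity the significance tests in the table bear on.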
decline in Maze score after chlorpromazine
[...] extension series. The original or standard
[...] of the experimental group was 11.9, of
the control group 11.96. The results of the
comparisons are set forth in Table 1.

Another approach was to compare the scores
of 20 pairs of control and experimental cases,
matched exactly by original standard scores.
[...] they lost 2.2 years as against a gain of 0.2
year for the controls when the extension se-
ries was applied to each group. Reference to
the tables will show that in the experimental
group the critical ratio of the mean difference,
F ratio, and t ratio obtained by simple analy-
sis [...] Mann-Whitney (z) test, a nonparametric distribu-
tion-free test, the results were all significant
[...]


Table 2

Measure              Experimental    Control

N                        20            20
Standard Maze
  Mean                  11.85         11.85
  SD                     3.50          3.50
Extension Maze
  Mean                  [...]         12.05
  SD                    [...]         [...]
Difference              [...]         [...]
t                        2.10*

* Significant at .05 level.


[...] from the extension series [...] time, the tests being inverted
[...] Table 3 shows [...]
scores still well below the original ones,
whereas the control group was very close to
the ceiling of the test.


Table 3

Maze Scores—Repeated Applications [...]

Measure              Experimental    Control

Standard Maze
  Mean                  12.00         12.02
Extension Maze 1
  Mean                  10.50         11.98
  SD                     3.89          3.29
  Difference            -1.50         [...]
Extension Maze 2 a
  Mean                   9.91         13.09
  SD                     3.61          2.95
  Difference            -2.09          1.07
t                        1.92*         1.17

* Significant at .05 level.
a Extension series inverted.


[...] yet throw no light on the [...]
to whether the effects of
the drug are transitory or [...]
much may be said. Since [...]
cooperative unworried individual must be
otherwise paid for, and it may well be that
deficits in initiative [...]
are part of the price. What has proved to be
the case in lobotomy may well be true for
ataractic drugs. As far as present results go,
the parallel between chlorpromazine and psy-
chosurgery seems clear. Notwithstanding Maze
deficits, patients treated with chlorpromazine
are obviously less self-concerned, less anxious,
and less aggressive, and therefore better ad-
justed at a simple social level. This fact [...]
affect claims for the Maze as an index of gen-
eral social adaptability. It may well be that
adaptability is of two kinds, one in which
freedom from inner tensions makes the indi-
vidual easier to live with, and another in
which planning and initiative make him more
industrially capable. The latter may be what
the test measures. A question to be answered
by further research is whether the benefits as
well as any deficits that follow the use of
ataractic drugs are transitory or permanent.


Summary

Repeated administrations of the Maze Test
to patients receiving chlorpromazine, in com-
parison with a control group, reveal a con-
tinued deficit of about two years. The decline
in maze scores is comparable to that shown
by patients who have undergone lobotomy.
A Multidimensional Comparison of Therapist Activity
in Analytic and Client-centered Therapy¹


Hans H. Strupp


The George Washington University School of Medicine


“In science we need flexible minds and rigid
concepts, but in psychoanalysis we have rigid
minds and flexible concepts” (13, p. 233).
This statement by a leading analyst, which
is equally applicable to other forms of psy-
chotherapy, epitomizes a growing awareness
among research-minded psychotherapists that
the fluidity of concepts, the ambiguities of
language, and the idiosyncratic frames of ref-
erence espoused by competing schools repre-
sent serious barriers against furthering our
knowledge of the psychotherapeutic process.
From numerous quarters in recent years has
come the cry for simpler concepts, for opera-
tional definitions, and for identifying the com-
mon denominators underlying all psychothera-
peutic procedures. This trend implies, among
other things, that differences in theory are
meaningless if they fail to carry over into
practice, and that focus upon the actual opera-
tions may be more fruitful for testing theo-
retical differences than prolonged controversy
about the uniqueness of a given system.

The analysis of therapeutic protocols has
occupied the time of researchers for some
years, but rarely has an attempt been made
to go outside a school of thought and to com-
pare the techniques of, say, a nondirectivist
with those of an analyst. Yet, such compari-
sons will inevitably play a part in future at-
tempts to evaluate the relative effectiveness
of competing approaches to psychotherapy.


1 This research is part of a larger project which is
supported by a research grant (M-965) from the Na-
tional Institute of Mental Health, of the National In-
stitutes of Health, U. S. Public Health Service. Grate-
ful acknowledgment is made to Winfred Overholser,
M.D., under whose general direction this work was
carried out, and to Leon Yochelson, M.D., project
consultant. In addition, I am greatly indebted to my
former research associate, Rebecca E. Rieger, A.M.,
who contributed materially to the execution of this
study. A slightly different version of this paper was
presented at the 1956 Annual Meeting of the Ameri-
can Psychological Association in Chicago.


This paper presents a preliminary descriptive 
analysis of two varieties of psychotherapeutic 
techniques: insight therapy with reeducative 
goals based on psychoanalytic principles, and 
client-centered therapy. The analysis is medi- 
ated by a multidimensional system, designed 
to quantify the common denominators in the 
verbal operations of therapists irrespective of 
their theoretical orientation. The data obvi- 
ously do not permit an evaluation of the re- 
spective merits of short-term analytic and 
client-centered therapy. 


The Two Case Histories 


The first case history, published by Wol-
berg (12, pp. 688-780),² comprises nine treat-
ment sessions with a retired business woman,
a widow in the middle years of life, who had 
become progressively depressed, and retreated 
from her customary social contacts. Concern- 
ing his technique, the therapist (Wolberg) 
mentions that the work proceeded almost en- 
tirely on a characterologic level, and that the 
effect of treatment was mostly of a reeduca- 
tive nature, despite the fact that he inter- 
preted some of the patient’s defenses. A fol- 
low-up indicated that the results of treatment 
had been durable. 

The second case history is that of Mary 
Jane Tilden, counseled by Rogers in a series 
of eleven interviews (9, pp. 128-203). Un- 
fortunately, the author was not aware that 


2 This case, particularly the therapist’s activity, has 
been more fully discussed (11). 
this case is available in its entirety, which
necessitated the selection of reasonably com-
plete interviews from the beginning, middle,
and terminal phases of treatment from the
[...] described as a 20-year-old,
attractive young woman brought to the clinic
by her mother, who complained that the pa-
tient was sleeping all the time, brooding, and
ruminating. Miss Tilden seemed to be with-
drawing progressively—she had given up her
[...] lost interest in
[...] treated by nondirective therapy.
Rogers felt that the [...]


[...]mate). These [...]
characterized as [...]

00 Facilitating Communication (Minimal activity).
10 Exploratory Operations.
20 Clarification (Minimal interpretation).
30 Interpretive Operations.
40 Structuring.
50 Direct Guidance.
60 Activity not clearly relevant to the task of therapy.
70 Unclassifiable.

Sixteen subcategories served to refine the primary
rating.

Degree of Inference. This intensity scale was based
on the conception that inference is an integral part
Dynamic Focus. Dynamic Focus referred to the
frame of reference adopted by the therapist at
[...] juncture, and characterizes the manner in
[...]
categories:

B-1  Requests for additional information.
B-2  Focus on dynamic events in the past.
B-3  Focus on dynamic events in the present.
B-4T Focus on dynamics of the therapist-patient
     relationship (analysis of the transference).
B-4  Focus on the therapist-patient interaction
     in terms of the therapist’s role as an ex-
     pert, authority, etc.

[...] low to high, and ratings [...]
As in the [...]
scale points were defined by
examples.

Therapeutic Climate. Emotional overtones discer-
nible in a communication were quantified by means
of a bipolar scale: 0 = neutral; +1 = mild degree of
warmth; +2 = strong degree of warmth; [...] de-
gree of coldness; [...] strong [...]
[...] communication is one in [...]
draws support, or punishes.

[...] Tilden case were scored jointly by two raters
from the printed scripts. Two of the Wolberg
interviews [...] rated independently by [...]
[...]tain a measure of rater agree[...]

Results

Rater Agreement

Table 1 presents results based on a unit-
by-unit analysis of two interviews scored in-
dependently by two raters. Agreement on a
unit (therapist communication) means that
both raters assigned it to the same category


Table 1

Agreement Between Two Independent Raters a

                         Wolberg             Wolberg
                         Interview VII       Interview IX
System Component         (N = 114)           (N = 154)

Type                     80.7%               80.5%
Degree of Inference      86.0% (r = .86)     94.0% (r = .885)
Dynamic Focus            80.7%               85.7%
Initiative               87.7% (r = .87)     93.5% (r = .93)
Therapeutic Climate b    —                   —

a All percentages and correlation coefficients are significant
beyond the .01 level.
b Nonzero scores too infrequent.


(on Type and Focus, respectively), or that 
they gave it an intensity score (on Degree of 
Inference or Initiative, respectively) no more 
than one-half step apart. For the last two 
scales, product-moment coefficients were com- 
puted in addition. 
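
The two agreement criteria just described, and the product-moment coefficient, can be sketched as follows. This is an illustration of the definitions only: the ratings are invented for five therapist communications and are not data from the study, and the function names are hypothetical.

```python
from math import sqrt


def category_agreement(rater_a, rater_b):
    """Percent of units that both raters placed in the same category."""
    hits = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return 100.0 * hits / len(rater_a)


def intensity_agreement(rater_a, rater_b, tolerance=0.5):
    """Percent of units whose two intensity scores differ by at most `tolerance`."""
    hits = sum(1 for a, b in zip(rater_a, rater_b) if abs(a - b) <= tolerance)
    return 100.0 * hits / len(rater_a)


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings (Type codes and Degree of Inference scores).
type_a = [10, 20, 30, 20, 50]
type_b = [10, 20, 20, 20, 50]
inference_a = [1.0, 2.0, 3.5, 2.0, 4.0]
inference_b = [1.5, 2.0, 2.5, 2.5, 4.0]

print(category_agreement(type_a, type_b))
print(intensity_agreement(inference_a, inference_b))
print(round(pearson_r(inference_a, inference_b), 2))
```

Percent agreement within a half step is a lenient criterion, which may be why the correlation coefficient was reported alongside it for the two intensity scales.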


The Wolberg Case

The therapist’s activity, as mirrored by the
multidimensional system of analysis, is pre-
sented in Figures 1, 2, 3, and 4. Within each
interview, frequencies have been converted
into percentages. In the case of Degree of
Inference and Initiative, the designation Level 
1, 2, and 3 signifies that scores have been 
grouped; Level 3 refers to the most intense 
scores. Chi squares computed for each com- 
ponent of the system were significant beyond 
the .01 level, indicating that the fluctuations 
in therapist activity for the interview series 
are not attributable to chance. 
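
The two computations mentioned here, converting frequencies to percentages within each interview and testing whether the distribution of scores differs across interviews, can be sketched as below. The counts are invented for illustration (rows are interviews, columns are Levels 1 to 3), the function names are hypothetical, and the sketch stops at the chi-square statistic and its degrees of freedom; the significance level would still have to be read from a chi-square table.

```python
def to_percentages(counts):
    """Convert one interview's category frequencies to percentages."""
    total = sum(counts)
    return [100.0 * c / total for c in counts]


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical frequencies: three interviews by three intensity levels.
counts = [
    [30, 15, 5],
    [20, 20, 10],
    [10, 25, 15],
]

for row in counts:
    print([round(p, 1) for p in to_percentages(row)])

statistic, df = chi_square(counts)
print(round(statistic, 2), df)
```

Percentages make interviews of different length comparable on a chart, while the chi-square test is run on the raw frequencies, since it depends on the number of units observed.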

The therapist’s techniques show systematic 
variations on all components over the course 
of therapy.⁴ The initial interview is devoted
largely to an exploration of the patient’s prob- 
lem; the next two interviews reveal an intensi- 
fication of therapeutic activity, both in terms 
of inferential operations and Initiative; Inter- 
views IV and VII emerge as interpretive ones, 
the intervening sessions as less “dramatic”; 
data for the remaining sessions point to a 
phasing out of interpretive activity, but Initia- 
tive remains at a relatively high level. 

The therapist’s interpretations are geared to 
the patient’s current interpersonal relations, 
with relatively little emphasis on the thera- 


4 Therapeutic Climate had to be omitted because 
there were very few nonzero scores. 


[Figure 1: percent of therapist interventions by interview. Categories: Minimal Activity (00), Exploration (10), Clarification (20), Interpretation (30), Direct Guidance (50), Miscellaneous (40 and 60).]

Fig. 1. Therapist activity in the Wolberg case in terms of Type of Therapeutic Activity. (Interviews:
I, N = 108; II, N = 79; III, N = 108; IV, N = 174; V, N = 123; VI, N = 85; VII, N = 114; VIII, N =
130; IX, N = 154. Total number of therapist interventions: N = 1,075.)
Fig. 4. Therapist activity in the Wolberg case in terms of Initiative.


reeducative goals” appears to be corroborated 
by the quantitative analyses.

The most noteworthy single result is per- 
haps the phasing of therapeutic activity. It 
seems as if the therapist gradually prepares 
the patient for more inferential formulations 
which he advances in the fourth session. Then 
he waits for a consolidation of insight before 
renewing his interpretive efforts in Interview 
VII. Thereafter, he diminishes his interpretive 
activity while maintaining a degree of thera- 
peutic pressure till the end. 


[Figure 5: percent of therapist interventions by interview. Categories: Minimal Activity (00), Exploration (10), Clarification (20), Interpretation (30), Miscellaneous (40 and 60), Direct Guidance (50).]

Fig. 5. Therapist activity in the Miss Tilden case
in terms of Type of Therapeutic Activity. (Inter-
views: I, N = 57; V, N = 23; XI, N = 53. Total
number of therapist interventions: N = 133.)


The Case of Miss Tilden 


The analysis comprises three selected inter- 
views; they are, however, separated in time 
and they presumably represent different stages 
of therapy. 

Reference to Figures 5, 6, 7, and 8 indi- 
cates that the profiles of therapist activity are 
quite similar from interview to interview. As 
might be expected, reflections of feeling ac- 
count for a large percentage of all interven- 
tions (75%); interpretations are virtually 
absent; explorations are used minimally in 
the initial session and are almost nonexistent 
later on; direct guidance is equally rare. The 


Fig. 6. Therapist activity in the Miss Tilden case in
terms of Degree of Inference.
data on Degree of Inference and Initiative
corroborate these findings: neither maximal
[...] interventions, the th[...]
[...]tient’s focus; only very rarely do[...]
the role of an expert [...]


Fig. 8. Therapist activity in the Miss Tilden case in
terms of Initiative.


[...] were selected and [...]
Tilden case. Since
categories within Type and Dynamic Focus
vary so greatly for the two therapists, the
only meaningful comparisons concern the con-
tinua of Degree of Inference and Initiative.
The results of this analysis are presented in
Table 2.

[...]iews, but not in the initial on[...]
[...]ing is accounted for by the fac[...]
[...]mploys a great many explora[...]
diagnostic character in his
[...]rst session which [...]


Table 2

Chi-Square Comparisons of Therapist Activity in
Initial, Middle, and Terminal Interviews

                      Wolberg I      Wolberg IV     Wolberg IX
                      (N = 108)      (N = 174)      (N = 154)
                      vs.            vs.            vs.
                      Rogers I       Rogers V       Rogers XI
                      (N = 57)       (N = 23)       (N = 53)

Degree of Inference   19.32          9.30[...]      4.66*
Initiative            [...]19        22.79          9.85**

* [...] the .02 and .05 level.
** Significant at the .01 level.
[...] Significant at the .001 level.
Discussion


A multidimensional system of analysis has 
been applied to the therapist’s communica- 
tions in two forms of therapy in an effort to 
measure aspects which may be common to
both. With respect to the Miss Tilden case, 
the system of analysis yields data which are 
substantially in agreement with other analy- 
ses which have been performed on interviews 
conducted by nondirective counselors. By and 
large, these results also agree with Rogers’ 
recommendations on therapeutic technique. 
Wolberg’s technique, too, is in agreement with 
his descriptive account but, to my knowledge, 
no comparable quantitative studies have been 
published. While not crucial, such evidence 
attests indirectly to the validity of this sys- 
tem of analysis. Of at least equal importance, 
however, is the tentative demonstration that 
the method facilitates the comparative treat- 
ment of therapeutic techniques—a treatment 
which is quantitative and highly objective, 
and which does not prejudge a particular com- 
munication as desirable or undesirable on a 
priori grounds. 

To be sure, the present two case histories 
are comparable only in superficial respects 
and they do not lend themselves to a rigorous 
evaluative comparison. However, they suggest 
a number of questions which appear to be 
basic to all psychotherapy research. Consider 
the following two points. 

We know that both patients entered psy- 
chotherapy seeking alleviation of their emo- 
tional problems. Did their difficulties have any 
common basis? What was the relative degree 
of their disturbance? Even if both had been 
diagnosed as “depressed,” or by any other
label, we would know but little about the 
common denominators of the underlying dy- 
namics. As Kubie (6) has pointed out, the 
time is ripe for fresh attempts to identify the 
common principles of the “neurotic process.” 
It is clear that studies in which patients are 
matched with experimental “controls” remain 
largely meaningless unless this Herculean re- 
search task can be accomplished. 

Secondly, what transpired in the thera- 
peutic sessions that led both therapists to 
evaluate the outcome as “successful”? Both
therapists are highly experienced men in their 
field; both had a rationale for their respec- 
tive procedures which on the evidence of this 
study differed quantitatively (Degree of In-
ference and Initiative) and perhaps qualita-
tively (Type and Dynamic Focus). Rogers,
in keeping with his theory, consistently re- 
flected the patient’s feelings, whereas Wol- 
berg, combining analytic principles with re- 
educative techniques, attempted to effect 
therapeutic changes in his patient mainly by 
means of interpretation and guidance. But
even if the patients could be equated it would 
not be possible to attribute differences in 
therapeutic outcome (whose measurement is 
another staggering problem) to variations in 
technique as long as relevant factors in the 
therapist’s personality are left out of account. 
Certainly, Wolberg was more “directive” (by 
Rogerian standards). But both therapists con- 
veyed an attitude of respect for their patients 
and implied their right to self-direction; both 
appeared to be warm, accepting, and non- 
critical; both encouraged the patient’s expres- 
sion of feelings; and both, by their thera- 
peutic performance, seemed to engender a 
feeling of greater self-acceptance in their pa-
tients. These attitudes on the part of the
therapist—he may have them in common with
the mature person who can also be a good
parent⁵—are as yet largely unexplored by
objective research, but they may be the touch- 
stone of all therapeutic success, regardless of 
the theory. Given the “basic therapist per- 
sonality” it may still be possible that some 
therapeutic techniques or combinations of 
techniques catalyze the therapeutic process 
whereas others are relatively inert; contrari- 
wise, no amount of training in technique may 
compensate for deficiencies in the therapist’s 
“basic attitudes.” To approach these prob- 
lems by research is difficult, but by no means 
impossible. 
5 I have in mind Fromm’s “productive character.”
6 There is increasing evidence that the therapist’s 
attitude may “cut across” theoretical orientations. 
For a comprehensive statement of the client-centered 
position, see Rogers’ discussion (8, pp. 19-64). On 
the other hand, Wolberg’s transcript offers evidence 
that respect for the patient, his capacities, his right 
to self-direction, and his worth as a human being 
can be conveyed even when the therapist makes in-
terpretations. Fiedler’s studies (2, 3, 4) suggest that
“experts,” irrespective of whether they subscribe to
the analytic, Adlerian, or client-centered viewpoint,
create highly similar “ideal therapeutic relationships”
but, as Bordin (1, pp. 115-116) has pointed out,
Fiedler’s findings cannot be regarded as evidence for
or against the question of the importance [...] at-
tached to differences among theories. [...]


Summary

In an effort to compare the therapist’s ac-
tivity in two forms of psychotherapy, a multi-
dimensional system for analyzing therapeutic
[...]
schools should lead to more definitive studies
of the therapist’s personality, particularly of
those attitudes which, wittingly or unwit-
tingly, he brings to bear upon the therapeutic
interaction.
Improvement and Amount of Therapeutic Contact:
An Alternative to the Use of No-treatment
Controls in Psychotherapy¹


Stanley D. Imber, Jerome D. Frank, Earl H. Nash, 
Anthony R. Stone, and Lester H. Gliedman 
The Johns Hopkins University School of Medicine 


One of the most formidable problems for
researchers in psychotherapy is the use of
control groups for the evaluation of treatment
effects. The difference in results obtained from


a no-treatment control group and an experi- 
mental (treatment) group should provide the 
crucial test of the efficacy of a psychothera- 
peutic method. For both ethical and practical 
reasons, however, the utilization of this para- 
digm has in general not proved feasible. From 
an ethical point of view, it is not considered 
proper to deny a sick patient available thera- 
peutic facilities, even for the sake of scientific 
certainty. In this light, it may seem incon- 
sistent that no-treatment controls are appear- 
ing with increasing frequency in pharmacologi- 
cal (as well as electro-shock and lobotomy) 
studies although there remains a relative 
dearth of these controls in studies of psycho- 
therapy. This is true despite the fact that 
both drug and psychotherapy investigations 
may use similar nonhospitalized patient sam- 
ples, with comparable diagnoses and levels of 
illness. There are, however, important differ- 
ences in the conditions under which studies in 
these two areas are conducted. To begin with, 
there is the matter of the length of the con- 


1 This study is part of a larger research project
supported by Research Grant No. M-532 (C-2) from 
the National Institute of Mental Health, United 
States Public Health Service. 

2 There are exceptions to this generalization, e.g., 
the work of Rogers and his colleagues (6) and Bar- 
ron and Leary (1). The great preponderance of psy- 
chotherapy studies, however, have no control pro- 
visions at all, at least in the sense in which we are 
concerned with control design in this paper. 


trol period. Where a drug is being tested, the 
control patient ordinarily is denied therapeu- 
tic intervention for only a few weeks, rarely 
more than a month. This relatively brief time 
interval is considered sufficient for evaluat- 
ing the potency of the drug, and hence, the 
control period is adequate. Unfortunately, the 
efficacy of psychotherapy cannot be assessed 
in so short a time. Psychotherapy controls 
would have to undergo a no-treatment period 
of at least six months, possibly much longer, 
if they are to match the average time most 
patients spend in treatment. Few psychothera- 
pists are willing to expose patients to this 
condition, if some treatment form is available. 
Furthermore, there is the very practical 
problem of retaining control patients in the 
experimental design. Should a large number 
of patients deliberately withdraw or drop out 
prior to completion of the experiment, a 
biased sample may remain which differs in 
important characteristics from that popula- 
tion which was the original focus of investiga- 
tion. The likelihood of a large dropout is much 
greater in psychotherapy than in drug studies, 
primarily because of the differential in con- 
trol time described above. It is a rare case 
where a distressed patient is willing to wait 
six months or longer for professional help, as 
would be the situation for the therapy con- 
trol. Moreover, the motivation and degree of 
illness of the patient who does accept this 
limitation become a matter of conjecture. 
There is still another distinction between 
drug and therapy controls which bears com-
ment. Theoretically, a control patient should
have no contact with the clinic or treatment
source during the control period since even 
brief contact may be equated with treatment. 
This restriction is frequently violated in drug 
studies where control patients sometimes re- 
ceive placebos and usually maintain clinic con- 
tact over the control period, at least for “pre- 
scription renewal” purposes. These patients 
are given to understand and believe that they 
are being “treated.” Furthermore, there is 
evidence that placebos alone can effect sig- 
nificant changes in the physical and emotional 
status of patients (7). It is a moot question, 
therefore, whether placebo patients may be 
accurately described as receiving “no treat- 
d, no-treatment con- 
ordinarily have no 
are fully aware that 
tmal treatment, Un- 


r use is more often 
challenged on ethical grounds, 


tice for evaluating electro
procedures in the same settings. This discrep-
ancy may have something to do with the fact
that, compared to the latter two types of 


where more “con- 
e being examined, 
ns used in typical 
studies of shock and lobotomy are hampered 
For example, although 
studies are excluded
from a clearly defined experimental condition,
strictly speaking they are not no-treatment
controls since usually the patients are ex-
posed to other treatment influences, e.g., the
general therapeutic atmosphere of the hos-
pital plus occupational therapy, recreational
therapy, or other special activities.

It seems, then, that although the use of no-
treatment controls would improve the elegance
of experimental designs, their application to
the evaluation of psychiatric treatments has
encountered serious ethical and practical ob-
stacles. These obstacles have been particu-
larly prominent in investigations of psycho-
therapy where, as a consequence, no-treatment
controls are used only infrequently. The most
common type of design in current psycho-
therapy studies consists of the comparison of
the pre- and posttherapy status of a single
treatment group or the comparison of two or
more groups, differing in techniques or meth-
ods of treatment. One method or technique
may be found superior to the others although
all of them may produce change in patients,
at least from their initial baseline. The failure
to include control groups, however, leaves
crucial questions unanswered.

Some investigators have attempted to [...]
[...] categories, in terms of the populations used:
“Dropout” patients (or terminators) and
“Wait-list” patients.

1. Dropout patients are those individuals
who applied for treatment but either deliber-
ately, or for reasons beyond their control,
did not keep any treatment appointments, or
terminated treatment very early, usually with-
out the approval of the therapist. The flaw in-
volved in the use of these patients as controls
is that they represent an obviously self-se-
lected sample whose motivation for help, on
a behavioral basis alone, is quite different
from patients accepting and receiving treat-
ment. In the dropout sample there are usually
a few patients who are unable to keep treat-
ment appointments for “accidental” reasons
seemingly unrelated to any otherwise negative
attitudes toward treatment. Closer examina-
tion of the reasons offered often reveals that
these reasons simply have provided convenient
opportunities for avoiding and rejecting treat-
ment (3). In short, these patients generally
differ from treated patients in one significant 
respect, their unwillingness to accept treat- 
ment. In the case of those few patients whose 
reasons for rejecting treatment appear quite 
plausible (e.g., family suddenly moved to an- 
other state), the unavailability of the patients 
for treatment generally means that re-evalua- 
tions of such patients at later periods are also 
impossible. 

2. Most treatment centers are unable to 
offer therapy immediately to new patients 
and, except for acute cases, routinely require 
wait periods of varying lengths before the be- 
ginning of formal treatment. In using these 
wait-list patients as controls, the investigator 
has the advantage of the availability of indi- 
viduals presumably not different from those 
presently in treatment but who, for reasons 
not determined by themselves, must undergo 
an interval of time without access to formal 
treatment. A study utilizing these patients 
must make certain that the wait-list repre- 
sents an unbiased sample and is reduced sys- 
tematically, i.e., that patients are not assigned 
differentially from the wait-list to a par- 
ticular kind of psychotherapy or a particular 
therapist. It is not uncommon for patients 
with more acute symptoms, or with especially 
“interesting” or unique problems, to receive 
priority in assignment, thus making the wait- 
list a repository of “superficial” or unwanted 
cases.³

Recently, the rather ingenious device of 
“own-controls” has been adopted to obviate 
the difficult matter of matching control and 
treated groups (4). Here the changes in a pa-
tient during the wait interval are compared
with the changes in that same patient during
treatment. The one serious drawback in this
approach is that the wait period almost in- 
variably is shorter than the treatment period 
itself. The lack of equivalence of the two 
periods reduces the validity of comparisons. 
Aside from the temporal inequality, the own-
control design has another shortcoming, which
is shared by all wait-list approaches: the
largely undetermined influence of the promise
of treatment on the patients’ status (1). The
knowledge that treatment is in the offing may
work to improve or worsen the condition of
some patients, but to what extent, if at all, is
a question that only recently has received
some attention (6).

3 Grummon, for example, placed a client “in the
own-control group only if it seemed that waiting
was not likely to cause him serious discomfort or
harm ... (and) assignment to the wait group was
occasionally changed to the no-wait group if ...
the client developed anxiety during the waiting pe-
riod . . .” (4, p. 46).

Because of the many complicating factors 
found in the efforts to use traditional and even 
modified controls, Watterson (8) has sug- 
gested another possible solution. “When we 
test the efficacy of a given drug, we give 
tablets of the actual drug to the experimental 
group and control group. Neither the thera- 
pist . . . nor the patients themselves know 
which are the real and which the dummy pills.
It is possible and logical to think about psy- 
chotherapy in a parallel way; the patient re- 
ceives a unit of supposed treatment, but this 
unit may contain the necessary ingredients or 
it may not. There is nothing fanciful or un- 
usual about such a point of view. We are quite 
used to judging a kind of psychotherapy as 
being likely to succeed or to fail because it 
contains or fails to contain this or that in- 
gredient” (8, p. 239). Watterson adds that a 
test of this approach will require the precise 
statement of hypotheses relative to the signifi- 
cant elements in the treatment and consequent 
changes in the patient. In this way, testable 
predictions may be made concerning patient 
change as a function of some specific element 
or technique. Experimental and control groups 
would be used, but neither patient nor thera- 
pist would be aware of what hypotheses were 
being tested. The study reported herein serves 
as an illustration of this alternative design 
suggested by Watterson. 

Previous unpublished data accumulated by 
the authors in a series of pilot studies indi- 
cated that the amount of contact between pa- 
tient and psychotherapist is an important fac- 
tor influencing improvement rate. Because this 
factor appeared so consistent over several cri- 
teria, and seemed to account for so much of 
the variance between improved and unim- 
proved patients, it was felt that it should be 
controlled in the next sequence of studies. In 
the present experiment, therefore, a group of 
patients was formed in which patients were 
to have minimal contact with their therapists,
in contrast to the more traditional ones.
The limitation was imposed in terms of the
number of appointments over a period of time
and the amount of time which each of the
separate appointments consumed. The
hypothesis to be tested was that patients who
have fewer and briefer sessions of psycho- 
therapy will show significantly less improve- 
ment in the effectiveness of their social rela- 
tionships than patients with measurably more 
and longer psychotherapeutic sessions, over 
approximately the same period of time. 


Procedure 


A total of 54 psychiatric patients ⁴ who ap-
peared at the Henry Phipps Psychiatric Clinic
Outpatient Department between June, 1953 

1954 were included in this study. 
Most patients were psychoneurotic, diagnosed


is, or mental deficiency were ex-
cluded. The patients were assi
three different f 


Py in which 
a week; 

in the Continued 
Phipps Outpatient 
seen individually 
ery two weeks, 


Three psychiatrists participated in t
They were in the 


dency, and had ap] i 
perience in both group and individual treat- 


4 The total sample actually was 91 patients, but
those patients (N = 28) who had three or less psy-
chotherapy sessions were excluded from the present
le of 54 patients
ons. To make al-
(arbitrarily set at
patients were assigned ini-
63 patients were treated in
the program. Nine patients were omitted from the
present analysis to balance the design and


The major condition for
ment. Each psychiatrist treated a total of
eighteen patients, six in each of the three
forms of treatment. Assignment of patients to
form of treatment and to therapist was made
on a random basis by the research staff.
Neither patient nor psychiatrist, therefore,
could influence either choice of treatment or
choice of therapist (or patient). Patients in
the three treatment forms showed no signifi-
cant differences in age, sex, diagnosis, social
class, length of illness, or marital status.
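
The balanced random assignment described above (three psychiatrists, three forms of treatment, six patients per cell) can be illustrated with a short sketch. It is only one way such a design could be filled and is not the procedure the research staff reported; the function name `assign_patients`, the therapist labels, the patient numbering, and the fixed seed are all assumptions introduced for illustration.

```python
import random
from collections import Counter


def assign_patients(patient_ids, therapists, treatments, per_cell, seed=None):
    """Randomly assign patients so that every therapist-by-treatment
    cell receives exactly `per_cell` patients (hypothetical helper)."""
    # One slot per patient: each (therapist, treatment) pair repeated per_cell times.
    slots = [(th, tr) for th in therapists for tr in treatments
             for _ in range(per_cell)]
    if len(slots) != len(patient_ids):
        raise ValueError("number of patients must equal number of design slots")
    rng = random.Random(seed)
    rng.shuffle(slots)  # random order of slots gives a random assignment
    return dict(zip(patient_ids, slots))


assignment = assign_patients(
    patient_ids=list(range(1, 55)),          # 54 patients, numbered arbitrarily
    therapists=["A", "B", "C"],              # labels are placeholders
    treatments=["group", "individual", "CTC"],
    per_cell=6,
    seed=0,
)
cell_sizes = Counter(assignment.values())    # patients per therapist-treatment cell
```

Because the slots are shuffled rather than chosen by anyone involved in treatment, neither patient nor psychiatrist influences the pairing, which is the property the text emphasizes.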
Prior to treatment assignment, each patient
was given a structured interview by the re-
search psychiatrist, and the interview was
observed through a one-way screen by the re-
search psychologist. The focus of the inter-
view was the patient’s day-to-day relation-
ships with each significant individual in his
life (e.g., spouse, siblings, children, parents,
boss, co-workers, male and female peers, etc.).
The extent of the patient’s ineffective behav-
ior with each individual during the four [...]
period immediately preceding the interview
was rated on a six-point scale in each of 15
categories: overly-independent, superficially
sociable, extrapunitive, officious, impulsive,
hyper-reactive, overly-systematic, overly-de-
pendent, withdrawn, intrapunitive, irrespon-
sible, overcautious, constrained, unsystematic,
and sexual adjustment. The interviewing psy-
chiatrist and the psychologist-observer made
separate ratings on the patient, and in a
conference, after comparing differences in rat-
ings,⁵ completed a series of joint ratings based
on the consensus of their conference discus-
sion. A similar interview was also conducted
with a relative (or close friend) of each pa-
tient by the research social worker and a sec-
ond psychologist-observer, the interview simi-
larly focusing on the patient’s ineffective
social behavior. [...] interviewer and [...]
conferenced, resulting in a set of joint ratings. Fi-
nally, the two interviewing teams met [...]
their accumulated information and made a
single final series of ratings in each of the 15
categories. The ratings for all categories were
summed, resulting in a total Social Ineffective-
ness score.

5 Reliability studies indicate that interrater corre-
lation over the 15 categories of the Ineffectiveness
Scale is approximately .69.

The experimental design required re-evalua-
tion of all patients six months after treatment
commenced, again after twelve and twenty-
four months, and periodically thereafter. Data
from the first (six-month) re-evaluation is re-
ported in this paper.


Results


Table 1 indicates that up to the time of the
first re-evaluation, all three categories of pa-
tients had been in treatment over approxi-
mately the same period of time (from 5.0 to
5.5 months),⁶ but there was a difference be-
tween them in terms of average number of
therapeutic sessions attended over this pe-
riod. The number of sessions for group pa-
tients was 15.8, for individual patients 17.7,
and for CTC patients only 9.3. On the aver-
age per month, then, group patients had 3.2
sessions, individual patients 3.5 sessions, and
CTC patients 1.7 sessions. In short, over the
same period of time patients in individual
treatment and those in group treatment had
approximately twice as many therapeutic con-
tacts as patients assigned to the Continued
Treatment Clinic.
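
The per-month figures quoted above follow from dividing mean sessions by mean months of treatment. The sketch below retraces that arithmetic for the two groups whose months entry in Table 1 is legible in this copy (group 5.0, CTC 5.5). For the individual group it works backward from the reported 3.5 sessions per month to the months value that figure implies; that back-calculation is an illustration, not a value taken from the source.

```python
# Mean sessions attended up to the first re-evaluation, as reported in the text.
mean_sessions = {"group": 15.8, "individual": 17.7, "CTC": 9.3}

# Mean months in treatment; only the legible Table 1 entries are used here.
mean_months = {"group": 5.0, "CTC": 5.5}

# Sessions per month = mean sessions / mean months.
rates = {k: mean_sessions[k] / mean_months[k] for k in mean_months}
rounded = {k: round(v, 1) for k, v in rates.items()}

# Months of treatment implied by the reported rate for individual patients.
reported_individual_rate = 3.5
implied_individual_months = mean_sessions["individual"] / reported_individual_rate

# How many times more contact group patients had than CTC patients.
contact_ratio_group = rates["group"] / rates["CTC"]
```

The ratio of group to CTC contact comes out somewhat under two, which fits the text's wording of "approximately twice as many" contacts.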

Patient change (or “improvement”) was 
measured by the algebraic difference between 
the initial (pretherapy) Ineffectiveness scores 
and the first re-evaluation Ineffectiveness 
scores. These difference or change scores were 


6 A few patients in each type of treatment termi-
nated before the 6-month evaluation, reducing the
mean number of months of treatment in each cate-
gory to less than six.

Table 1


Mean Number of Months and Sessions of Psycho-
therapy for Group, Individual, and
CTC Patients (N = 54)


Treatment      Months         Sessions
Group          5.0            15.8
Individual     [illegible]    17.7
CTC            5.5             9.3

analyzed to determine the relative efficacy of 
the three forms of therapy, the influence of 
the three psychiatrists, and the importance of 
the interaction between therapy and psychia- 
trist. As a statistical control for certain dif- 
ferences between types of treatment due to 
differential dropout rates, and for a small cor- 
relation between initial and change scores, the 
analysis of covariance was applied. The re- 
sults summarized in Table 2 indicate a highly 
significant difference between types of treat- 
ment. Group patients improved more than 
CTC patients (p < .01), and individually 
treated patients also improved more than 
those treated in CTC (p < .05). No differ- 
ence was found between group and individu- 
ally treated patients. There was, however, 
some slight evidence of a difference in the in- 
fluence of the psychiatrists (p < .20). 
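
The F ratios behind these statements can be retraced from the sums of squares and cross-products in Table 2. The sketch below is a check under stated assumptions rather than the authors' own computation: it uses the standard covariance adjustment (adjusted ΣY² = ΣY² minus (ΣXY)² / ΣX²), obtains each effect's adjusted sum of squares as the difference between the "effect + pooled error" line and the pooled error line, and divides by the adjusted pooled-error mean square on 48 degrees of freedom. The numeric inputs are the Table 2 entries as read here, with decimal placement in the Therapist row treated as an assumption; the function names are introduced only for illustration.

```python
def adjusted_ss(sxx, sxy, syy):
    """Sum of squares of Y after removing its linear regression on X."""
    return syy - sxy ** 2 / sxx


def ancova_f(effect, error, df_effect, df_error_adj):
    """F ratio for one effect in a one-covariate analysis of covariance.

    `effect` and `error` are (sum X^2, sum XY, sum Y^2) triples.
    """
    combined = tuple(e + r for e, r in zip(effect, error))
    adj_error = adjusted_ss(*error)
    adj_effect = adjusted_ss(*combined) - adj_error
    return (adj_effect / df_effect) / (adj_error / df_error_adj)


# (sum X^2, sum XY, sum Y^2); X = initial score, Y = difference score.
pooled_error = (2348.96, 772.73, 1919.60)
therapist = (100.48, 42.83, 169.00)   # decimal placement assumed
therapy = (615.82, -14.12, 294.70)

adj_pooled_error = adjusted_ss(*pooled_error)
f_therapist = ancova_f(therapist, pooled_error, df_effect=2, df_error_adj=48)
f_therapy = ancova_f(therapy, pooled_error, df_effect=2, df_error_adj=48)
```

Under these readings the adjusted pooled-error sum of squares reproduces the tabled 1665.40, which suggests the decimal placement assumed for the Therapist row is consistent with the rest of the table.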


Table 2


Analysis of Covariance of the Ineffectiveness Scale Scores for Treated Patients (N = 54)


Source                      df      ΣX²      ΣXY      ΣY²   Adj. ΣY²   df   Adj. MS   F
Therapist                    2   100.48    42.83   169.00     151.65    2     75.83   2.19*
Therapy                      2   615.82   -14.12   294.70     354.79    2    177.40   5.11**
Interaction                  4    39.63   -13.10    92.60     105.81    4     26.45
Within                      45  2309.33   785.83  1827.00    1559.59   44     35.44

Total                       53  3065.26   801.44  2382.30

Pooled Error                49  2348.96   772.73  1919.60    1665.40   48     34.70
Therapist + Pooled Error    51  2449.44   815.56  2088.60    1817.05   50
Therapy + Pooled Error      51  2964.78   758.61  2214.30    2020.19   50

* p < .20.
** p < .01.
Note. X and Y refer to Initial and Difference scores, respectively.


Discussion


The hypothesis that fewer and briefer ses-
sions of psychotherapy reduce improvement
has been confirmed within the framework of
the paradigm suggested by Watterson. An as- 
sumed “necessary ingredient” of treatment 
was diluted or restricted for certain patients 
whose improvement rate was thereby ad- 
versely affected. A step by step verification of 
other supposed necessary ingredients of psy- 
chotherapy might well proceed in analogous 
fashion. 


Specifically, the present study indicates that 
reduction of 


s Sgestive rather 
; Since the genera] background, 


efficacy of the therapist
(7, p. 301). It was sug-
gested that the adequate evaluation of any
treatment form requires that it be compared

rectly, if at all. The conviction with which the
[...] the different types of treat-
ment and the therapists was not investigated.
Yet there appears to be no compelling reason
why patients should have the least faith in
the CTC treatment. To most patients who
come to a public clinic, this kind of brief in-
termittent contact with a physician is similar
to their other medical experiences, and pre[...]
when they requested psychiatric help. At least
initially, therefore, there seems little reason to
consider the CTC patients as not having confi-
dence in the treatment form to which they
were assigned. On the other hand, the there, [...]

However, if we [...]
out [...]
cannot be said [...]
attribute. Other studies by the authors have
shown that the dropout rate for the patients
in group therapy was appreciably higher than
for patients in CTC (2).

Finally, it might be suggested that numbers
of therapeutic contacts are [...] them-
selves crucial for instilling faith in a thera[...]
That is, a patient’s con[...]
be helped is not neces-
sarily something that he brings to treatment [...]
but is a kind of confidence that [...]
built up through fairly intensive and frequent
therapeutic contacts. Hence, the inferiority of
the CTC treatment might simply reflect the
less amount of faith of patients assigned
thereto, as a consequence of their limited
therapeutic contacts. If this interpretation is
correct, the experiment may be said to con-
firm the validity of the concept of placebo ef-
fect as it operates in psychotherapy.


Summary


The unavailability of nontreatment control
groups to test the efficacy of psychotherapy
and dissatisfaction with the use of dropout
and wait-list groups as substitute procedures
have prompted the use of an alternate experi-
mental design. This design requires the pre-
cise statement of hypotheses relative to the
presence or absence of an assumed significant
element of treatment and the consequent
changes in patients. Adopting this scheme, the
present study specified that patients having
fewer and briefer sessions of psychotherapy
will show significantly less improvement than
patients with more and longer sessions, over
the same period of time. Fifty-four psychiatric
patients were assigned at random to three psy-
chiatrists, each of whom treated an equal
number of patients in group therapy and two
different forms of individual therapy. In one
of these latter forms, the patients were able to
have only one-half as many psychotherapy
sessions and the sessions lasted only one-half
as long as patients treated in the other two
forms. Over a six-month experimental period
the patients with restricted therapeutic con-
tacts showed less improvement on the criterion
of change used. The significance of amount of
therapeutic contacts is discussed.
A Comparison of “Remainers” and “Defectors”
Among Child Clinic Patients¹


Eugene E. Levitt


Illinois Institute for Juvenile Research


Studies by Rubinstein and Lorr (3) and
Hiler (2) have indicated that certain back-
ground data and test results distinguish adult
patients who break off therapy prematurely
from those who continue. The present study
is an investigation of this phenomenon with
child patients. The “defector” sample was a
group of 208 cases who were accepted for
therapy at a child guidance clinic, but who
failed to appear for any treatment interviews.
The “remainers” were 132 cases in which
either the mother or the child had had at
least 20 treatment interviews. [...]
available for 61 vari[...]
concerned objective [...]
like race, sex, men[...]
dealt with the child’s health an[...]
ing history, 10 concerned symptoms and prob-
lems, parental descriptions made up 9 vari-
ables, 15 more dealt with parental handling
of the child, 4 variables were obtained from
projective test protocols, [...]
including severity of
the disturbance and prognosis, and there were
3 miscellaneous variables. With the exception
of mental age, grade placement, and age at
first examination, all variables were discrete.
Analyses of differences between groups were
therefore accomplished by chi squares. The
continuous variables were analyzed by t tests.
Thirty-four, or 56%, of the analyses had p
values of .50 or higher, and 26, or 43%, had
values of .70 or above. Only 8 reached beyond
the .20 level, 5 of these attaining significance
at the .05 level or beyond. The 5 variables do
not seem to hang together in a logical fashion,
nor does there appear to be any theoretical
reason to expect them to be differentiating.
In general, chance significance is suggested.
Using the Brozek-Tiede approximation [...],
the probability of finding 5 analyses
significant at the .05 level by chance is very
nearly .25. The hypothesis of chance signifi-
cance therefore seems tenable, and we can
reasonably conclude that the remainer and
defector child patients do not differ with re-
spect to the 61 variables analyzed herein.
Brief Report.
Received January 15, 1957.

1 An extended report of this study may be ob-
tained without charge from Eugene E. Levitt, Insti-
tute for Juvenile Research, Chicago, Illinois, or for
a fee from the American Documentation Institute,
Order Document No. 5182, remitting $1.25 for mi-
crofilm or $1.25 for photocopies.
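
The chance-significance argument in the brief report above can be illustrated with an exact binomial tail. This is not the Brozek-Tiede approximation the report cites, and it assumes the 61 analyses are independent, which the report does not claim, so the value it yields need not coincide with the figure quoted there. The function name `binomial_tail` is introduced for illustration only.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_analyses = 61
alpha = 0.05

# Number of analyses expected to reach the .05 level by chance alone.
expected_by_chance = n_analyses * alpha

# Probability of 5 or more "significant" analyses if every null hypothesis is true.
p_chance = binomial_tail(n_analyses, 5, alpha)
```

With about three significant results expected by chance, observing five is not unusual under these assumptions, which is the direction of the report's conclusion.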
The Effectiveness of Group Psychotherapy with Chronic
Schizophrenic Patients and an Evaluation of
Different Therapeutic Methods¹


Ralph G. Semon
VA Mental Hygiene Clinic, Lowell, Mass.


and Norman Goldstein²
Psychological Counseling Center, Brandeis University


Since 1921, when the use of group psycho-
therapy with the various types of psychoses
was first reported (6), treatment by this
method has shown a steady increase (4). De-
spite general agreement about the value of
group therapy in the treatment of chronic
schizophrenic patients, however, research in
this area has not kept pace with the clinical
use of the technique. There have been only
two objective studies, and of these only Sacks
and Berger (11) published definitive results
with statistical controls. They reported posi-
tive changes in the intrahospital behavior of
chronic schizophrenic patients after one year
of group therapy. Powdermaker and Frank
(10) evaluated the effectiveness of group ther-
apy with a similar group of psychotic patients.
While their findings were not statistically sig-
nificant, the authors were enthusiastic about
its potential value.

1 This study was conducted at the Boston State
Hospital on whose staff the authors were employed
at the time of the investigation. It was supported in
part by a Research Fellowship from the National
Institute of Mental Health, U. S. Public Health
Service.

This study is part of a doctoral dissertation com-
pleted at Boston University in 1954, under the gen-
eral direction of Drs. Chester C. Bennett, John Ar-
senian, Austin W. Berkeley, and Nathan Maccoby,
to whom the authors are indebted for their encour-
agement and guidance.

The authors wish to express their appreciation to
Walter E. Barton, M.D., Superintendent, and to the
staff of the Boston State Hospital, for their coopera-
tion and assistance in this research.

Special acknowledgment goes to the late Robert S.
Johnson, M.D., for his assistance in the selection of
the patients and for the facilitation of the work of
the authors on the male chronic service of the Boston
State Hospital where he was Senior Psychiatrist at
the time of the study.

2 Now with the Psychiatry Service, Beth Israel
Hospital, Boston.

Another problem in need of further evalua- 
tion is the relative efficacy of different meth- 
ods of group treatment. Slavson (14), and 
Meiers (9), in their comprehensive analyses 
of current trends in group therapy, discuss 
the development of two major orientations. 
These have been described as “therapy in a 
group” and “therapy through a group” (1, p. 
343). Bovard (2) uses the analogous terms 
“leader-centered” and “group-centered,” to
distinguish methods used with nontherapy
groups. The leader-centered method, or ther-
apy in a group, assumes that the therapeutic
potential is resident in the relationship formed
between each member and the leader with the
result that the focus of treatment is on the
individual within the group. In the group-
centered method, or therapy through a group,
the assumption is that the motivation for
change is contained within the emotional re-
lationship established among the members of
the group. In this method, the focus of treat-
ment is on the group.

There have been no controlled investiga-
tions of the relative merits of these two ori-
entations to group psychotherapy with chronic
schizophrenic patients, although qualitative
observations on their use have been reported
(13). Frank also discussed two different phi-
losophies of treatment in group psychotherapy
with such patients. After describing the work
of exponents of these two methods, Frank
concluded: “Our experiences allow no deci-
sion as to whether the method of working pri-
marily with the group, or working primarily
with the individual patients, is better” (3, p.
229).

In view of the increasing use of group psy-
chotherapy in the treatment of chronic schizo-
phrenic patients, and the limited number of
controlled studies in this area, it seemed pro-
ductive to investigate the following problems:
(a) the value of group psychotherapy for
hospitalized chronic schizophrenic patients
and (b) the relative effectiveness of different
methods of treatment. The hypotheses were

1. Groups that receive psychotherapy will [...]
[...] will be b[...]
[...]cient evidence to indicate that one approach [...]
the relative therapeutic effectiveness of two [...]


Table 1

Description of Groups on Basis of Matching Variables

                     Total HAS ᵃ       HAS Subscale I ᵇ    Total
                     Score             Score               Hospitalization
Groups               (pre-therapy)     (pre-therapy)       (in years)         Age
Experimental
  I     Mean         44.7              41.7                13                 [illegible]
        Range        12.2-84.6         12.5-88.2           8-22               29-[illegible]
  II    Mean         45.9              42.5                13                 37
        Range        28.0-72.4         16.7-65.2           6-24               33-48
  III   Mean         45.1              48.1                13                 [illegible]
        Range        16.3-83.1         17.4-73.3           5-19               29-4[illegible]
  IV    Mean         44.6              42.9                14                 37
        Range        14.9-89.2         13.0-91.7           5-23               26-45
Control
  V     Mean         48.3              50.2                14                 38
        Range        14.3-81.6         11.5-84.6           3-21               25-45

a Hospital Adjustment Scale.
b Communication and Interpersonal Relations.


Method


Patient Population


The experimental design called for five
matched groups of chronic schizophrenic pa-
tients, four experimental and one control.
The N for each of the experimental groups
was eight; the N for the control group was
seven. The patients who participated in this
study were from the male chronic service of
the Boston State Hospital. To maximize ho-
mogeneity of the [...], the patients were
selected on the basis of the following criteria:
(a) they must all be from the diagnostic cate-
gory of schizophrenia; (b) they must [...]
remission; (c) they must be within the [...]

The patients were then assigned [...] on the
following variables: (a) over-all adju[...]
in the hospital; (b) interpersonal function-
ing; (c) length of total hospitalization; and
(d) age. The measure of over-all hospital 
adjustment was the total score (pretherapy) 
on the Palo Alto Hospital Adjustment Scale 
(HAS), while interpersonal functioning was 
measured by the score (pretherapy) on sub- 
scale I of the HAS. This scale will be more 
fully discussed below. 

The matching on the variables of age and 
length of hospitalization was tested by non- 
parametric procedures for equal and unequal 
groups (7, 15). The t test for the significance
of the difference between uncorrelated means 
was applied to test the matching on the HAS 
total score and the subscale I score. No sig- 
nificant differences were found. Table 1 de- 
scribes the groups with respect to the match- 
ing variables. 


Groups 


The five groups were randomly designated 
as the control and experimental groups. Each 
of the four experimental groups had 50 hours 
of group therapy, meeting for daily sessions 
of one hour, five days a week for ten weeks. 
The mean attendance throughout the treat- 
ment period was 7.6. With the exception of 
two meetings the number of patients present 
was at least six. The control patients did not 
meet as a group, but were evaluated before 
and after the treatment period along with the 
experimental patients. All the patients con- 
tinued to receive the standard custodial treat- 
ment on the wards. 

The two therapeutic techniques used in 
this study were designated Active-Participant 
(AP) and Active-Interpretive (AI). The for- 
mer is comparable to the group-centered 
method, while the latter is comparable to the 
leader-centered method. The term active is ap- 
plied to both methods, since activity on the 
part of the therapist is seen as a necessary 
characteristic of work with psychotic patients. 
The methods were characterized by prescribed 
differences in the role of the leader. Of the 
four experimental groups, two were randomly 
selected as AP groups and two as AI. The au- 
thors alternated as leader and observer so that 
each was the therapist in two groups, assum- 
ing the AP role with one group and the AI 
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role with the other throughout the treatment 
period. 


Leader Role 


The two leader roles were defined as fol- 
lows: 

a. In the Active-Participant role, the aim 
of the leader was to promote interaction. To 
this end he functioned as a quasi-member of 
the group. His behavior was directed primarily 
toward the stimulation of group activity and 
the encouragement of participation on the part 
of each member. With this purpose in mind, 
the leader played a relatively inactive role in 
the determination of group issues, encouraged 
member-to-member interaction, promoted an 
attitude of mutual support and sharing of ex- 
periences and feelings, and minimized his in- 
vestigation of personality dynamics. 

b. In the Active-Interpretive role, the aim 
of the leader was to emphasize investigation 
and interpretation with a view to promoting 
understanding of underlying motivations. To 
this end he analyzed the feelings and attitudes 
of group members, and communicated to them 
his understanding of the dynamics. The leader 
was the central figure in the organization of 
the group. Thus, he played a relatively active 
role in the determination of group issues, 
clarified issues for the purpose of encouraging 
further investigation, investigated and inter- 
preted the motivations for member behavior, 
and focused on individual understanding of 
feelings and attitudes. 


Validation of Leader Role 


The degree to which both leaders were suc- 
cessful in assuming the two different roles was 
determined by having two resident psychia- 
trists, each with a minimum of a year of group 
psychotherapeutic experience, judge leader 
roles from a number of tape recordings of 
group sessions. 

Twenty-four time samples of ten minutes 
each from eight group meetings were used for 
this validation procedure. The eight meetings 
were selected at random from those sessions 
which were later to be used in the analysis of 
the data. The ten minute selections were taken 
from the early, middle, and late portions of 
each of the eight meetings. The time samples 
were equally divided between the AP and AI 
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Table 2 


Means, t Ratios, and Probabilities for the Significance of the Difference
Between Pre- and Posttherapy Scores on the Palo Alto Hospital Adjustment Scale (HAS)


                                  Hospital Adjustment Scale

                        Total   Subscale  Subscale  Subscale  Subscales I and II
Groups                  Score   I (a)     II (b)    III (c)   Combined

Experimental (combined)
  Means    Pre          45.1    43.8      48.5      43.0      [illegible]
           Post         49.5    48.4      56.7      44.7      [illegible]
  t                     1.60    1.81      1.56      .36       [illegible]
  p                     <.10    <.05      <.10      >.35      [illegible]
Control
  Means    Pre          48.3    50.2      53.8      40.2      [illegible]
           Post         48.5    52.2      47.2      42.6      [illegible]
  t                     .03     .33       -.81      .26       .90
  p                     >.90    >.70      >.40      >.80      [illegible]


(a) Communication and Interpersonal Relations.
(b) Care of Self and Social Responsibility.
(c) Work, Activities, and Recreation.


sessions, and the order of presentation was
randomized each time in an effort to minimize
any patterning effect.

Both judges correctly identified the role
which the leader assumed in twenty-two of the
samples (92%). For the other two samples,
one judge identified the leader’s intent correctly,
while the other judge disagreed. Thus,
96% of the judgments confirmed the leaders’
interpretations of role.


Leaders


The two leaders in this study were the authors
of this paper. They were clinical psychologists
who had each worked at least two
years in a mental hospital. Each had a minimum
of one year of group psychotherapeutic
experience with psychotic patients. In addition,
each leader had personal experience as a
member of a training group.


Measures


The measure of therapeutic effectiveness
used was the Palo Alto Hospital Adjustment
Scale (HAS). This is a rating scale designed
to evaluate patients’ behavior in a psychiatric
hospital, with emphasis on interpersonal behavior
(8). The scale is divided into three
subscales, each of which can be scored separately.
Subscale I measures “Communication
and Interpersonal Relations”; Subscale II
deals with “Care of Self and Social Responsibility”;
Subscale III rates “Work, Activities,
and Recreation.” Ratings are made by the
ward attendant, and a score is obtained which
gives a quantitative estimate of the patient’s
adjustment in the hospital; the higher the
score, the better the adjustment.


Results


Hypothesis 1 was tested by comparing pretherapy
scores with posttherapy scores on the
HAS for the experimental groups and
also for the control group. The statistic used
was the t ratio for the significance of the difference
between correlated means. Since the
prediction for the control group did not involve
a directional trend, the two-sided test
of significance was used. The prediction of
positive changes in the experimental groups
allowed the use of a one-sided test.

Table 2 shows that the control group made
no significant improvement on the HAS ratings.
This was as predicted. The p for
the difference between the means of the pre-
and posttherapy scores for the combined
experimental groups was < .10. This was
accepted as suggestive of an important trend,
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but not as sufficient basis for confirmation of 
the hypothesis that group therapy makes a 
difference with respect to clinical improve- 
ment. 
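
The t ratio for correlated means used above can be sketched as follows. This is only an illustrative sketch: the function name `paired_t` and the five pairs of ratings are hypothetical and are not data from the study. The same t serves both the one-sided and the two-sided test; only the tabled probability consulted differs.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """t ratio for the difference between correlated (paired) means."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean(diffs) / se, n - 1    # (t, degrees of freedom)


# Hypothetical pre- and posttherapy ratings for five patients.
pre = [40.0, 42.0, 45.0, 47.0, 50.0]
post = [43.0, 44.0, 46.0, 52.0, 53.0]
t, df = paired_t(pre, post)
print(f"t = {t:.2f}, df = {df}")
```

When t falls in the predicted direction, the one-sided probability is half the two-sided one, which is why a directional prediction makes the test more lenient.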

On closer inspection of the results, how- 
ever, it became evident that the behaviors 
measured by Subscale I and Subscale II were 
most directly affected. Subscale III made no
significant contribution to the over-all picture 
of improvement (Table 2). As was previously 
noted, Subscale III evaluates changes in the 
areas of Work, Activities, and Recreation. In 
retrospect, it seemed unreasonable to expect 
changes on this part of the HAS, as the op- 
portunities in these areas were relatively static 
and limited on the wards from which most of 
the patients came. 

Subscale III was therefore excluded, and
the analysis was repeated using an HAS score 
based on Subscales I and II combined. The 
results (Table 2) show that the experimental 
groups, by this measure, manifested improve- 
ment at the .05 level of significance. Pre- and 
posttherapy comparisons for the control group 
showed no significant change. The findings on 
the HAS for Subscales I and II combined confirm
the hypothesis that group therapy will
effect significant clinical changes in chronic 
schizophrenic patients. 

The experimental and control groups can 
also be compared in terms of the overlapping 
of their gains on the HAS. Three of the seven 
patients in the control group, 43%, showed 
more gain than the average experimental pa- 
tient. In the experimental groups, 18 of 32 
patients, 56%, showed more gain than the 
average control patient. These percentages 
show the existence of real differences, but em- 
phasize their small magnitude. 
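
The overlap comparison can be illustrated with a short sketch. The helper `share_above` is a hypothetical name, and the individual gain scores are not given in the text, so only the reported counts are reproduced here.

```python
from statistics import mean


def share_above(gains, reference_gains):
    """Fraction of `gains` that exceed the mean of `reference_gains`."""
    cutoff = mean(reference_gains)
    return sum(g > cutoff for g in gains) / len(gains)


# The percentages in the text follow from the reported counts alone.
control_share = 3 / 7         # control patients above the average experimental gain
experimental_share = 18 / 32  # experimental patients above the average control gain
print(f"{control_share:.0%} {experimental_share:.0%}")
```

With individual gain scores in hand, `share_above(control_gains, experimental_gains)` would yield the first figure and the reversed call the second.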

Hypothesis 2 was tested by comparing the 
mean difference between the pre- and post- 
therapy HAS ratings for the combined AP 
groups with the mean difference between the 
pre- and posttherapy HAS ratings for the 
combined AI groups. The statistic used was 
the t ratio for the significance of the difference
between uncorrelated means. A two-sided
test of significance was utilized. The analysis 
was repeated for the scores based on Sub- 
scales I and II combined. No significant dif- 
ferences were found. The results of the sta- 
tistical treatment of the data did not permit 
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rejection of the hypothesis that there would 
be no significant difference in the therapeutic 
effectiveness of the two methods. 
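
A sketch of the t ratio for uncorrelated means, applied to gain scores (posttherapy minus pretherapy) of two independent sets of patients, is given below. It assumes the usual pooled-variance form of the statistic, which the text does not spell out, and the gain scores shown are hypothetical.

```python
import math
from statistics import mean, variance


def independent_t(x, y):
    """Pooled-variance t ratio for the difference between two uncorrelated means."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    se = math.sqrt(pooled * (1 / nx + 1 / ny))  # standard error of the difference
    return (mean(x) - mean(y)) / se, nx + ny - 2


# Hypothetical gain scores for the combined AP and the combined AI groups.
ap_gains = [2.0, 4.0, 6.0, 8.0]
ai_gains = [1.0, 3.0, 5.0, 7.0]
t, df = independent_t(ap_gains, ai_gains)
print(f"t = {t:.2f}, df = {df}")
```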

When the data were analyzed separately for 
the groups, the findings suggested that there 
were additional factors operating to influence 
the results. There seemed to be some sort 
of interaction between an approach and the 
characteristics of the person assuming it. An 
analysis of covariance, however, showed no 
significant interaction. 


Discussion 


The finding that group therapy effects sig- 
nificant improvement in the interpersonal 
functioning of chronic schizophrenic patients 
is consistent with the qualitative observations 
of other workers, and similar to the results 
obtained by Sacks and Berger (11). 

This result is of interest since disturbances 
in interpersonal relations are considered as 
central to the pathology of schizophrenia. 
The goal of group psychotherapeutic endeavor 
with schizophrenic patients has been seen as 
one of social rehabilitation. Semrad, in sum- 
ming up the experiences of the staff at the 
Boston State Hospital, stated that “we felt 
the patients attain from group therapy a 
social rehabilitation rather than a definite 
change in their personality trends” (12, p. 
235). Gurri and Chasen observed “that some- 
times they re-learn the art of social inter- 
course without even losing their delusional 
trends” (5, p. 52). In groups, patients may 
gradually modify previously established, in- 
effective social attitudes and techniques and 
in this way be better able both to deal with 
others and to assume greater responsibility for 
their own needs. The positive results from 
group therapy in this study support observa- 
tions along these lines. 

The therapeutic goals in the present study 
were limited in the sense that interpersonal 
adjustment within the hospital setting was 
the criterion for improvement. With chronic 
schizophrenic patients, this was seen as a 
necessary first step. While it was demon- 
strated that these people respond in a posi- 
tive manner to such influences, it remains to 
be shown that through group therapy chroni- 
cally sick mental patients can gain sufficient 
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personality organization and drive to re-estab- 
lish social ties outside the hospital. 


Summary 


The effectiveness of group psychotherapy 
with chronic schizophrenic patients and the 
relative merits of two different therapeutic 
methods were evaluated. Thirty-nine patients 
were selected and assigned to five matched 
groups, four experimental and one control. 
The two methods of group therapy were desig- 
nated Active-Participant and Active-Interpre- 
tive, and were characterized by contrasting 
styles of leadership. The experimental groups 
each had 50 hours of therapy. The measure of 
therapeutic effectiveness used was the Palo 
Alto Hospital Adjustment Scale. 

The results permitted the conclusion that 
chronic schizophrenic patients improve in 
group therapy with respect to interpersonal 
functioning. Differences in the relative merits 


of the two methods of group therapy were not 
demonstrated. 
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The Need for Representative Design in Studies 
of Interpersonal Perception 


Wayman J. Crow 


Behavior Research Laboratory, University of Colorado 


Brunswik (3, 4, 5) has pointed out that a 
functionally oriented psychology requires a re- 
search design of its own. Using examples taken 
primarily from research on the perception of 
physical size, he has demonstrated the re- 
stricted nature of classical systematic design 
and the necessity for representative design. 
The purpose of this paper is to illustrate the 
implications of representative design as applied
to research on interpersonal perception.1

In the typical interpersonal perception study, 
a group of subjects (Ss) is asked to observe 
another person (object). From this observa- 
tion, the Ss are required to estimate how that 
person (object) would respond to some pro- 
cedure. The Ss’ accuracy can be established 
by comparing their estimations with the actual 
performance of the object-person. In the usual 
study, the Ss have been sampled from some 
defined population, and results can be general- 
ized to the parent population by well-known 
statistical procedures. Not generally heeded, 
however, is Brunswik’s admonition that the 
same requirements of statistical logic apply to 
the objects as well as the subjects of psycho- 
logical studies. 

If randomly-selected Ss are asked to esti- 
mate the responses of a number of randomly- 
selected object-persons to a personality ques- 
tionnaire, results may be generalized to both 
parent populations. The precision of the esti- 
mate of the population parameter is a func- 
tion of both the standard deviation of the 
sample and the inverse of the sample size, or 
N. Obviously, generalization to a population 
of Ss would not be risked if the sample of Ss
was known to be unrepresentative of that
population, or if only one representative from
that population was provided in the study. If
the representativeness of the sample cannot be
established, then the use of statistical inference
is invalid. If only one representative of
the population is included in the sample, then
of course there is no way to estimate the
sampling error, and therefore it is impossible
to estimate what the results would have been
if a different S had been used. This logic applies
with equal force to the person-objects
used in experiments. It is especially important
for the problem of interpersonal perception,
for it must be granted that variability in personality
traits exists whether a person is perceiver
or perceived, i.e., whether he is subject
or object. It is mandatory, therefore, either to
use an adequate sample of objects randomly
selected from the population of objects to
which we wish to generalize, or to forego
generalization.

1 Hammond has pointed out examples of failure to
consider object sampling (and its consequences) in
opinion polling (11) and clinical psychology (12).
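
The dependence of precision on the standard deviation and on N can be made concrete with the standard error of a mean, s divided by the square root of N. The sketch below is illustrative only: the function name and the accuracy scores are hypothetical. It also shows why a single object gives no estimate of sampling error at all.

```python
import math
from statistics import StatisticsError, stdev


def standard_error(scores):
    """Standard error of the mean, s / sqrt(N); None when fewer than two scores."""
    try:
        return stdev(scores) / math.sqrt(len(scores))
    except StatisticsError:
        # With one object there is no sample variance to estimate.
        return None


# Hypothetical accuracy of one S in judging four object-persons, then only one.
four_objects = standard_error([0.5, 0.7, 0.6, 0.8])
one_object = standard_error([0.6])
print(four_objects, one_object)
```

The same computation applies whether the entries are scores of sampled Ss or accuracies obtained over sampled objects, which is the point of the argument.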


But to forego generalization to object popu- 
lations leads to research of limited usefulness. 
Much of current research is focused upon in- 
terpersonal perceptiveness as a general ability. 
For example, Kelly and Fiske (15) used an 
interpersonal perception measure as a cri- 
terion of diagnostic competence for clinical 
psychologists, and Gage and Suci (11) have 
investigated the relationship of interpersonal 
perception and teaching ability. The knowl- 
edge that a subject is accurate in estimating 
a single object-person’s responses may be use- 
ful under special circumstances, but much 
more useful would be a measure that informs 
us about a S’s accuracy in general. If gener- 
alization is foregone, however, all that remains 
is the knowledge that a clinician accurately 
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estimated a particular patient’s responses; 
there is no way to determine how accurately 
he would estimate responses of “patients in
general.” And, for the study of diagnostic 
competence, this is precisely the information 
needed. 

It is the contention of this paper that, at 
the present time, a “double standard” with 
regard to subject and object sampling in in- 
terpersonal perception research is the rule 
rather than the exception. For example, Luft 
(16) asked clinicians and nonclinicians to 
estimate how a patient would respond to a 
personality questionnaire. He concludes, “The 
results suggest that there is no direct rela- 
tionship between clinical training and the 
ability to predict verbal behavior of an indi- 
vidual” (16, p. 758). Luft does not caution 
his reader that in each of his two experiments 
he requires his subjects to make estimations 
for only one patient, i.e., one object. Luft 
does show concern for subject sampling (and 
thereby demonstrates the double standard) 
when he states, “What was surprising was the 
poor showing of the psychodiagnostician who 
had much more material on which to base his 
impressions than the other judges. Of course, 
his was only a single instance and it would 
be unfair to generalize” (emphasis added) 
(16, p. 757). 

In theoretically oriented studies, where gen- 
eralized ability is not the primary focus of at- 
tention, unrepresentative selection of objects 
still leads to conclusions of limited usefulness. 
Research concerned with the isolation, manipulation,
or interrelation of important variables
in interpersonal perception can be affected
by “accidents” in the selection of
objects as well as subjects. Conventionally, 
uncontrolled independent variables are as- 
sumed to have random effects upon the de- 
pendent variables of an experiment. The ex- 
perimenter attempts to insure the randomness 
of the effects of uncontrolled independent 
variables associated with the subjects by se- 
lecting the subjects at random from a defined 
population. Uncontrolled independent variables
associated with the objects, however,
also affect the results; therefore the experi- 
menter should use as much care in the selec- 
tion of objects as he does with subjects (cf. 
12, p. 155). Otherwise, the results are affected
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by accidents of selection in an unknown way 
and may become nonreproducible. 

An example is furnished by a study by 
Gage (9), who used four objects selected from 
students in an educational psychology class, 
plus two objects who were female clerical employees.
He pointed out that the latter “. . .
were thus not members of the same ‘cultural 
subgroup,’ a fact which may help us under- 
stand some of the results” (9, p. 7). Gage 
found that the correlations between the Ss’ 
odd and even predictions for these two objects 
were negative, although the Ss’ odd-even cor- 
relations for the other four objects were posi- 
tive. Gage’s results were vitally affected by 
uncontrolled independent variables associated 
with the objects used in his study. Similar 
contradictions in results should be anticipated 
by experimenters whenever they fail to abandon
the “double standard” and do not use the
same well-known [...] selection of their [...]

[...] ways: (a) either an in[...] objects was used, (b) the
[...] which the objects were se[...]d, or (c) the procedure
[...] 2, 6, 7, 8, 9, 10, 14, 15, 16, 17, 18, 19, 20).

[...] The case in which [...] subjects but also the [...]
[...]ead the experimenter to ignore [...]
representative design is intrinsic [...]
of interpersonal perception.
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A Note on the Clinical Validity of the Matsh-Hilliard- 
Liechti MMPI Sexual Deviation Scale 


William C. Holz, George F. Harding, and Sidney M. Glassman


U.S. Army


In a previous edition of this Journal, Marsh 
et al. (1) proposed a sexual deviate (Sd) scale 
for the MMPI. This scale was devised to dis- 
tinguish incarcerated sexual criminals from 
normal individuals; in addition, the authors 
suggested that it may also find more general 
application in the clinical diagnosis and treat- 
ment of sexual pathology. Peek and Storms 
(3), who questioned the latter generalization, 
tested a small sample from a state hospital 
population, and concluded that the scale was 
primarily a measure of general abnormality. 
The present authors have sought additional 
cross validation by applying the scale to a 
more homogeneous population than is found 
in the previous studies.


Table 1 
Comparison of Group Means by t Test


          L-H      C-D      Non-P
S-D       1.46*    .59*     2.90**
L-H                1.60*    7.95**
C-D                         6.99**


* Significant at .10 level.
** Significant at .01 level.


The MMPI was administered to 105 Army 
enlisted men, ranging in age from 19 to 31 
years. Included in this group were four classi- 
fications (2): (a) Sexual Deviate (S-D), 21 
known, admitted deviates who were in proc- 
ess of discharge from the service and who had 
been examined by a psychiatrist to assure that
the deviant sexual behavior was not symptomatic
of a neurosis or psychosis; (b) La-
12 Ss whose be- 
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excluding those who were in the “Sexual De- 
viate” subclassification; and (d) Non-Psychi- 
atric (Non-P), 30 Ss selected at random from 
an Army unit. 


Analysis of the data reveals that the stand[...]

[...]adjusted groups is equally well
differentiated from the Non-P’s. The correla[...]
[...] scale with four other scales [...]

The authors’ conclusions are essentially the
same as those of Peek and Storms (3), that
the scale measures generalized psychiatric
maladjustment. The findings of Marsh et al.
apparently resulted from the gross differences
between their comparison groups.
This study indicates that their scale cannot
be used to distinguish between sexual deviates
and other maladjusted groups.
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Rehabilitation of persons suffering from 
chronic physical illnesses attempts to restore 
these persons to levels of functioning as close 
as possible to their former states, within the 
limits of their disabilities. Usually this kind
of rehabilitation includes physical and occupational
therapy as well as the necessary medical
and surgical services.

The physical aspects of the rehabilitation 
process are fairly well understood and may
best be left to the medical profession to explain.
Our understanding of psychological factors,
however, is in a relative state of speculation
and conjecture, perhaps because the
psychologist’s past contribution has consisted
largely of empirical studies, many of which
are probably of a doubtful value from a
methodological point of view (2), and “clinical
impressions,” which lack the refinement
of controlled investigation (3, 4, 5). An aspect
of rehabilitation which the present authors
feel to be of primary psychological
importance is the communicative relationship
between the patient and the staff working
with him. The importance of communicative
sensitivity has been emphasized by Barker
and Wright who suggest, “The worker will be
most effective when he is sensitive to the clues
given by the client as to the course their relationship
should take,” and, “Being sensitive
to the client means that the worker must take
into account the emotional meaning of the
disability . . .” (3, p. 23).
1 Credit is also due Messrs. Don K. Worden and
Robert Postel, who served as assistants on this project,
and to the Cleveland Foundation which provided
the necessary funds.


The Significance of Patient-Staff Rapport in the
Rehabilitation of Individuals with Chronic
Physical Illness1


F. C. Shontz and S. L. Fink 


Highland View Hospital 


[...] patients. The program (now in its experimental
stage) provides for more frequent ther[...]

[...] participation or non-participation in an intensive
treatment program.

The semantic differential ([...]) [...] selected as the best a[...]
communicative rapport, since i[...]
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who are on an intensive treatment program 
have closer communicative rapport with (a) 
occupational therapists, and (b) physical 
therapists, than do patients not on an inten- 
sive treatment program. 


As a corollary: 


I-a. Semantic concepts relevant to patients’ 
current situations possess different semantic 
values for patients on an intensive treatment 
program than for patients not on an intensive 
treatment program. 

II. Patients who are judged by therapists 
to have shown great improvement in occupa- 
tional and physical therapy have closer com- 
municative rapport with these therapists than 
do patients who are judged to have shown 
little improvement in these areas. 


As a corollary: 


II-a. Semantic concepts relevant to patients’
current situations possess different
semantic values for patients judged to have 
shown great improvement in occupational and 
physical therapy than for patients judged to 
have shown little improvement in these areas. 

Supplementary investigations. In addition
to the tests of the major hypotheses, the present
study concerned itself with investigating
the effects of the variables “age,” “sex,” and
“length of hospital stay,” upon communicative
rapport. Each varia[...] determine wh[...]
at different levels of a[...] rapport with physical
or occupational therapists; and, finally,
tests were conducted to determine whether [...]

Method 
Measurement 


The semantic differential employed was
made up of twelve concepts rated by all Ss 
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on each of twelve associative scales, a total of 
144 ratings per administration. Seven con- 
cepts were selected to tap associations of cur- 
rent situational significance to the patient 
population: Occupational Therapy; Physical 
Therapy; Rehabilitation; Independence; Bed; 
Surgery; and Psychology. Five concepts were 
selected to represent associative areas the im- 
portance of which was felt to be relatively 
independent of hospital life as such: Life; 
Dream; Fate; Future; and Old Age. The 
scales were constructed to represent the three 
dimensions: Evaluative, Potency, and Activ- 
ity. Each scale was characterized by the 
names assigned to the two extreme ends of 
its continuum. The four Evaluative scales 


were: Good-Bad; High-Low; Happy-Sad;
and Beautiful-Ugly. The four Potency scales
were: Tough-Tender; Heavy-Light; Full-Empty;
and Sharp-Dull. The four Activity
scales were: Fast-Slow; Jumpy-Calm; Fero[...]
and Tense-Relaxed. Each S [...] agreement with the
first named [...] continuum (Good, Happy,
etc.), a value of seven indi[...] agreement with the second
[...] (Bad, Sad, Tender, Dull, [...]
from two to six were used for [...] middle category.

The instrument provided two types of measures:
(a) the index D, corres[...] Index of Seman[...]

Administration. Because of the physical
condition of many of the patients, the seman-
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tic differential was administered orally and in- 
dividually, with the examiner recording each 
S’s responses. Randomness of stimulus pres- 
entation was obtained by listing each concept 
and scale on separate 3” x 5” cards, which 
the examiner shuffled before each administra- 
tion. A Concept card was selected and placed 
before the S where he could refer to it as de- 
sired. Next, the scale cards were randomly 
presented by the examiner while the S made 
appropriate semantic judgments. Scale cards 
were reshuffled between each concept presenta- 
tion until the S had rated all concepts on all 
scales. 
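
The shuffling routine described above can be sketched as follows. The concept names are those listed in the text; the scale labels are placeholders standing in for the twelve scales, and the function name and the seeded generator are assumptions made only for illustration.

```python
import random

CONCEPTS = [
    "Occupational Therapy", "Physical Therapy", "Rehabilitation",
    "Independence", "Bed", "Surgery", "Psychology",
    "Life", "Dream", "Fate", "Future", "Old Age",
]
# Placeholder labels standing in for the twelve associative scales.
SCALES = [f"scale_{i}" for i in range(1, 13)]


def presentation_order(rng):
    """Return (concept, scale) pairs in a freshly shuffled order."""
    concepts = CONCEPTS[:]
    rng.shuffle(concepts)        # concept cards shuffled before the administration
    order = []
    for concept in concepts:
        scales = SCALES[:]
        rng.shuffle(scales)      # scale cards reshuffled for each concept
        order.extend((concept, scale) for scale in scales)
    return order


order = presentation_order(random.Random(0))
print(len(order))
```

Each administration therefore covers every concept-scale pairing once, in an order that differs from S to S.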

The instructions included a brief descrip- 
tion of the procedure, an explanation that 
this was a research to “find out how people 
feel about hospital life,” and directions to
provide numerical association values accord- 
ing to the “meaning the words have for you.” 
Clarification was provided as necessary, but 
most questions were answered with an indi- 


rect “whatever you think,” or “any way you 
like.” 


Reliability. Reliability was established in 
an earlier study which demonstrated that
variations between ratings assigned by the
same individual on successive administrations
of the semantic differential were significantly
smaller (p less than .001) than variations between
ratings assigned by individuals paired
at random. This represents evidence of sufficient
reliability for purposes of the present
research.


Subjects


The Ss were selected from the patient and 
professional population of Highland View
Hospital, Cleveland, Ohio. Twenty-seven patients,
nine occupational therapists, and seven
physical therapists participated. Patients were
not selected on the basis of their affliction
with any particular physical illness, although
there was reason to believe that all groups
were closely comparable with regard to “degree
of disability” as such. Patients with
known organic brain pathology were specifically
excluded.

Of the 27 patients, 24 served in the in- 
vestigation for possible age, sex, and hospitali- 
zation differences. It was convenient to dis- 
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tribute these subjects into a 2 x 2 X 2 fac- 
torial design, with three Ss in every cell. The 
criteria for dichotomizing the variables were: 


1. The younger group included 12 patients under
30 years of age; the older group included 12 patients
over 30 years of age. (Mean ages: 22.5 years and
40.7 years; average deviations: 2.3 and 6.6 years,
respectively.)

2. Twelve males and 12 females were included.

3. The sample was divided according to whether
the individual had been in the hospital for more or
less than one year. (Mean number of months: 26.2
and 4.7; average deviations: 7.8 and 2.7 months,
respectively.) There were 12 patients in each group.

Therefore, of the 24 individuals, three were males
under 30 years of age who had been in the hospital
less than one year. Three were males under 30 years
of age who had been in the hospital more than one
year. There were two similar groups of females; and
a similar pattern was repeated for Ss over 30 years
of age, making a total of eight cells containing three
Ss each. The balance of the patients was used when
necessary and appropriate in the tests of the three
major propositions and their corollaries.
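
The layout of the factorial design can be checked with a short enumeration. The level labels follow the three criteria above; the variable and function names are hypothetical.

```python
from itertools import product

AGE = ("under 30", "over 30")
SEX = ("male", "female")
STAY = ("less than one year", "more than one year")
SS_PER_CELL = 3

cells = list(product(AGE, SEX, STAY))  # the eight cells of the 2 x 2 x 2 design


def n_at_level(factor_index, level):
    """Number of Ss at one level of one factor (0 = age, 1 = sex, 2 = stay)."""
    return sum(SS_PER_CELL for cell in cells if cell[factor_index] == level)


total_ss = len(cells) * SS_PER_CELL
print(len(cells), total_ss, n_at_level(0, "under 30"))
```

Every level of every variable thus holds 12 Ss, balanced over the other two variables.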


It was originally planned to match each
patient-subject with his own physical and
occupational therapist, but staff assignment
rotations, personnel turnover, and other considerations
militated against such a procedure.
The use of “composite therapists”
seemed feasible, however, since, on an empirical
basis, all therapists were found to show
high interpersonal semantic agreements (communicative
rapport) within their own departments.
So far as the items on the present
scale are concerned, the associations expressed
by any one therapist were generally good predictors
of the associations expressed by them
all, interindividual variability being far more
characteristic of the patients than of the staff.
The nine occupational therapists were therefore
used in the compilation of the Median
OT, a composite derived by taking the median
associative value assigned by the therapists to
each of the 144 ratings on the semantic differential.
Patients were compared to the com[...]
[...] compilation of a [...] served as a standard [...]
semantic similarity [...] therapist as required.
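
The composite-therapist idea can be sketched in a few lines. The sketch assumes that the index D is the distance measure commonly used with the semantic differential, namely the square root of the summed squared differences between two sets of ratings; the definition is not legible in this text, so that formula is an assumption. The function names and the four-item ratings are hypothetical and stand in for the full 144 ratings.

```python
import math
from statistics import median


def median_composite(therapist_ratings):
    """Item-by-item median across therapists (one equal-length list per therapist)."""
    return [median(item) for item in zip(*therapist_ratings)]


def d_score(a, b):
    """Assumed form of D: square root of the summed squared rating differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical ratings on four items by three therapists and one patient.
therapists = [[1, 4, 7, 2], [2, 4, 6, 2], [3, 5, 7, 1]]
composite = median_composite(therapists)
patient = [4, 4, 5, 3]
print(composite, d_score(patient, composite))
```

A smaller D indicates closer semantic agreement between the patient and the composite therapist.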
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Procedure 


The nonparametric Mann-Whitney U test,
described by Moses (6) and Auble (1), was
the statistical technique employed. The U test 
is applicable when samples are of equal or un- 
equal size and implies no assumptions about 
parameter distributions. 
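
A minimal sketch of the U statistic follows, computed by direct counting rather than from rank sums. The function name and the two small samples of D scores are hypothetical, and the significance of U would still be read from published tables.

```python
def mann_whitney_u(x, y):
    """Return (U for x, U for y): counts of pairs won by each sample, ties as 1/2."""
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical D scores for two groups of unequal size.
group_a = [1.2, 2.5, 3.1]
group_b = [2.0, 3.5, 4.0, 4.4]
u_a, u_b = mann_whitney_u(group_a, group_b)
print(u_a, u_b, min(u_a, u_b))
```

The smaller of the two values is the one conventionally referred to the tables, and the two always sum to the product of the sample sizes.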


Preliminary investigations. The tests of 
variables, other than intensive treatment, 
which might affect a patient’s semantic asso- 
ciations utilized the 2 x 2 x 2 factorial de- 
sign. Within this framework, 54 pairs of Ss 
were selected so that, for each variable, 18 
pairs represented homogeneous matchings at 
one extreme of the variable (e.g., old with 
old); and 18 pairs represented homogeneous 
matchings at the other extreme (young with 
young). The remaining 18 pairs were cross- 
matchings on the variable in question (old 
with young). The null hypothesis, that D 
scores from the homogeneous matchings and
from the cross-matchings were samples from
a common population, was evaluated with one- 
tailed tests of significance. 

Two-tailed tests were used to evaluate the 
null hypothesis that D scores, between pa- 
tients and therapists, would be samples from 
a common population when tested in terms of 


a patient’s sex or level of age and hospital
stay.

The final test required the use of the patient’s
directly expressed semantic associations.
These were tested to evaluate the null
hypothesis that semantic values expressed by
groups of patients, selected to vary systematically
with respect to sex, age, and duration
of hospital stay, represented samples from a
common universe. Two-tailed tests were made
on each variable.


Propositions and corollaries. The major 
research propositions required statistical 
evaluation of series of D scores obtained through the 
comparison of appropriate patient and 
therapist subject-pairs. The corollaries involved the 
comparison of directly expressed semantic 
associations, without regard to interindividual 
similarities. Since directional predictions were 
made in the propositions, one-tailed tests were 
made of the necessary null hypotheses; and 
since no directional predictions were made in 
the corollaries, two-tailed tests were employed 
for these evaluations. 
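
How a D score between a patient and a median composite might be computed can be sketched as below. This section does not state the formula, so the sketch assumes the generalized distance commonly used with the semantic differential (the square root of the summed squared rating differences). The function names and the three-item example data are hypothetical; the study itself used 144 ratings per subject.

```python
import math
from statistics import median


def median_composite(therapist_ratings):
    """Median rating across therapists for each item (column-wise)."""
    return [median(column) for column in zip(*therapist_ratings)]


def d_score(profile_a, profile_b):
    """Assumed distance measure: sqrt of the summed squared differences.
    A smaller D means greater semantic similarity."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


# Invented ratings on 7-point scales: three therapists, three items each.
therapists = [
    [1, 4, 7],
    [2, 4, 6],
    [3, 5, 5],
]
composite = median_composite(therapists)  # one median value per item
patient = [2, 4, 3]
patient_d = d_score(patient, composite)
```

Each patient's D score against the composite can then serve as one observation in the group comparisons described above.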

Ratings of progress in physical and occu- 
pational therapy were required for the tests of 
proposition II. In occupational therapy, these 
ratings were obtained on 22 patients. The Ss 
were rated by their current therapists on two 
aspects of occupational therapy: Functional 
Capacities and Activities of Daily Living. 
Ratings were summed for all patients, who 
were then divided into two groups of 10 each: 
those who were judged to have shown 
relatively great improvement, and those who were 
judged to have shown relatively little 
improvement. Two Ss had shown only slight over-all 


improvement, and these Ss were not used since 
they seemed to be adeq 


up of 14 Ss and a 
up of 12 Ss, 
Results 
Preliminary Investigations 
In terms of D SCO 


Table 1 

Mean Values Assigned by Patients with Different Levels 
of Hospitalization to the Scales on Which 
Significant Differences Were Found 


                       Hospitalization time 
                    Less than      More than 
Scale                 1 yr.          1 yr.         p 
Evaluative 
  Good-Bad         [illegible]    [illegible]    <.025 
  Happy-Sad        [illegible]    [illegible]    <.025 
  Beautiful-Ugly   [illegible]        3+         <.025 
Sharp-Dull         [illegible]        3+         <.025 
Tough-Tender       [illegible]        4-         <.025 
Heavy-Light        [illegible]    [illegible]    <.025 
Jumpy-Calm         [illegible]        4-         <.05 
Full-Empty         [illegible]    [illegible]    <.05 
[illegible]        [illegible]    [illegible]    <.05 


Note.—Results rounded to nearest whole number. 


length of time in the hospital. The data pro- 
vided no evidence that the variables age, sex, 
or length of time in the hospital significantly 
affected the communicative rapport (meas- 
ured by D scores) between patients and the 
Median OT and Median PT composites. 

The directly expressed semantic values of 
younger patients were not significantly dif- 
ferent from the directly expressed semantic 
values of older patients; nor were the directly 
expressed semantic values of male patients sig- 
nificantly different from those of female pa- 
tients. Significant differences were found be- 
tween the directly expressed semantic values 
of patients in the hospital less than one year 
and the directly expressed semantic values of 
patients in the hospital more than one year 
(see Table 1). 


Table 2 


Levels of Significance of Differences in Semantic 
Similarity Between Patients and Therapists, 
for Patients Receiving and Patients Not 
Receiving Intensive Treatment 


                                For tests of       For tests of 
                                D scores           D scores 
                                between            between 
                                patients and       patients and 
                                occupational       physical 
                                therapists         therapists 
Measure                              p                  p 
Total semantic differential        <.10               <.04 
Scale group Evaluative             chance             chance 
Scale group Potency                <.05               <.01 
Scale group Activity               <.10               chance 
Scales 
  High-Low                         <.05               <.10 
  Tough-Tender                     <.025              <.01 
  Heavy-Light                      <.10               <.10 
  Full-Empty                       chance             <.10 
  Ferocious-Peaceful               <.025              <.025 
  Tense-Relaxed                    <.05               chance 
Concepts 
  Fate                             <.01               <.10 
  Occupational Therapy             chance             <.10 
  Physical Therapy                 <.05               <.10 
  Psychology                       <.02               <.10 
  Rehabilitation                   <.10               <.025 
  Surgery                          <.10               <.10 


Note.—One-tailed tests. All differences are in the direction 
of patients receiving intensive treatment showing greater 
similarity (smaller D) with therapists than patients not receiving 
intensive treatment. 
Table 3 


Levels of Significance of Differences in Semantic 
Similarity Between Patients and Therapists, 
for Patients Regarded as Showing 
Great and Little Improvement 


                                For tests of D scores 
                                between patients and 
                                occupational therapist 
Measure                                  p 
Total semantic differential            <.01 
Scale-group Potency                    <.01 
Scale-group Activity                   <.01 
Scales 
  Beautiful-Ugly                       <.025 
  Tough-Tender                         <.01 
  Full-Empty                           <.025 
  Fast-Slow                            <.01 
  Jumpy-Calm                           <.025 
  Ferocious-Peaceful                   <.01 
Concepts 
  Dream                                <.025 
  Fate                                 <.05 
  Future                               <.01 
  Life                                 <.01 
  Occupational Therapy                 <.01 


Note.—One-tailed tests. All differences are in the direction 
of patients showing great improvement having greater 
similarity (smaller D) with therapists than patients showing little 
improvement. 


xtreme ratings 
subject groups, 
P showed a sig- 


nificantly greater use of 1's on the Good-Bad, 


year” 
cantly higher frequencies of 1’s or 7’s on any 
of the scales. 

Propositions and Corollaries 


Significant results. ob 
cal 


relationship between patient-therapist 
communicative rapport and rated progress in 
occupational therapy, but not with respect 
to the relationship between patient-therapist 
communicative rapport and rated progress in 
physical therapy. Corollaries I-a and I-a 
were not considered to have been verified by 
the present investigation. 


Discussion 

The fact that males and females did not 
form homogeneous semantic groups in our 
study may be accounted for in either of two 
ways. The instrument itself may not be sensi- 
tive to such differences, or the size of the 
samples may have been too small for signifi- 
cant differences to appear. The same reason- 
ing may be used to explain our results on the 
age variable, but in this case there also could 
have been too small an age difference between 
the groups to have resulted in significant dif- 
ferences on the test. Both variables would be 
worth investigating in future research, since 
no definite conclusions may be drawn from 
the results obtained here. 

For the variable “length of time spent in the 
hospital,” the results were much more defini- 
tive than for age and sex. The results clearly 
indicated that the length of time patients 
spend in this institution has some measurable 
effect upon their feelings as reflected in the 
semantic differential. It seems that at least 
those aspects of their lives that are repre- 
sented in the test tend to change in meaning 
for them as they remain longer in the hospital 
setting. Apparently what happens is that 
“things in general” become less Good, less 
Happy, less Beautiful, less Full, less Sharp, 
less Tender, less Light, and less Calm. One 
finds what seems to be a growing indifference 
in attitude, and only future research can tell 
whether an actual reversal of feelings 
eventually could occur. Patients in most hospitals 
have a minimum of contact with the more 
stimulating experiences healthy people 
encounter. Their lives are routinized to such an 
extent that one day or week is the same as 
the next. Much of their lives is lived in 
fantasy, and reality no longer offers much 
emotional satisfaction. There is good reason to 
suspect that the patient’s level of motivation 
tends to decline with the passage of time in 
an institution, a problem which deserves 
future investigation in relation to the findings 
of the present research. 

Goals in rehabilitation are another factor 
which we feel to be of prime importance. The 
patient’s own initial expectations may tend to 
go far beyond what is realistic in terms of the 
nature and extent of his disability. Under such 
circumstances it could be expected that his 
attitudes, feelings, and motivation would show 
changes as he finds himself unable to reach 
his goal after months or years of effort. 

The verification of proposition I in the 
present investigation demonstrates the 
existence of a measurable relationship between 
intensive treatment and the communicative 
rapport of therapist and patient. It was apparent 
that communicative rapport between patient 
and therapist was generally greater for 
patients receiving intensive treatment than for 
patients not receiving such care; and this 
greater rapport existed to a significant degree 
in the patient’s relationship to both physical 
and occupational therapists. It is fairly 
certain that these results are not contaminated 
by the effects of age, sex, or length of the 
patient’s stay in the hospital. It is not so cer- 


erated in the 
The results 
I suggest that co 
lated to rated succ 
but that it does 


s therapy, in 
a is a different kind 
ere, the patient i ch 
presen ; , p: is not so mu 
i ie Predetermined routines as he 
tical acti Bed to become interested in prac- 
to ire ee which he must learn in order 
society. Communicative rapport, in the 
sense of agreement with respect to the 
value and efficacy of the program, is a sine 
qua non of success in occupational therapy; 
and it therefore follows that those who are 
judged to show progress in this area also show 
stronger semantic similarity with those who 
organize their programs. 

The data also suggest the possibility that 
intensive treatment of the type investigated 
here produces slightly more agreement be- 
tween patient and physical therapist than be- 
tween patient and occupational therapist, a 
fact which is consistent with the somewhat 
greater importance ascribed at this institution 
to physical recovery than to social and emo- 
tional readjustment; but it also suggests that 
intensive care could increase its total value 
by placing greater emphasis upon practical 
functional activities and by providing a more 
nearly equal encouragement of the patient in 
less purely medical areas. Although one can- 
not yet claim actual predictive ability for the 
semantic differential, the evidence does sug- 
gest that the instrument has very promising 
possibilities for individual diagnosis in this 
area. 

It is possible at this point to suggest a 
theory which may explain the findings of the 
present research and which may prove con- 
ducive to further investigation. It is the 
opinion of the present authors that patients 
receiving intensive treatment pattern² 
themselves closely after the models provided by 
their therapists. The chronically ill person 
shares a certain psychological similarity to 
the child: He is dependent upon others for 
his maintenance and for assistance in learn- 
ing how to cope with the world. It is com- 
monly understood that, for the child, this 
“someone else” is the parent (9, Ch. XIII). 
For the ill person, it is the doctor, the 
therapist, and perhaps the institution itself. A 
proper study of the patterning process as such 
would require longitudinal follow-up of 
patients from the time they entered 
rehabilitation until after they returned to their homes. 
But the present investigation not only does 
not contradict the theoretical idea, it tends to 
suggest that this process may actually be 
facilitated by intensive treatment. To the 
extent, therefore, that patterning is a necessary 
condition for maximum learning, intensive 
treatment may be considered an important 
contribution to the psychological side of 
rehabilitation itself. 


2 The term “patterning” may be defined as 
something more than behavioral imitation and/or passive 
assimilation of factual knowledge from another 
person, but as something less than complete equation of 
the total-self with the total-self of another 
individual. “Patterning,” as we define it, is selective and 
more limited than the traditional meaning of the 
term “identification” would imply. While the 
dynamic mechanism through which patterning takes 
place may be similar to that of identification, the 
scope and content of the two processes are not the 
same. 

The theory may be more specifically out- 
lined as follows: 

1. Patterning of a patient according to the 
models provided by his therapists is a de- 
terminant of progress in rehabilitation. Pat- 
terning is least significant in areas which are 
formal, highly structured, and minimally de- 
pendent upon the rapport of patient and 
therapist (e.g., physical therapy). It is of 
greatest significance in areas which are in- 
formal, less highly structured, and dependent 
upon favorable personal feelings between pa- 
tient and therapist (e.g., occupational ther- 
apy, social case work, psychotherapy). 

2. Patterning is facilitated by intensive 
treatment, at least in those areas which are 
emphasized by the treatment program itself 
(e.g., “purely medical” vs. “purely practical” 
orientations). It is probably also facilitated 
by an agreement between patient and thera- 
pist which antedates actual patient-therapist 
contact. 

It would seem to be true, on the basis of 
clinical observation, that a preexisting 
similarity of patient-therapist attitudes plus 
intensive treatment does present excellent 
conditions for rehabilitation success, although it 
remains for future research to determine 
which factors are most important and which 
are most easily affected by environmental 
factors. It is also a legitimate problem for 
future research to determine the effect of 
communicative rapport between patient and 
therapist upon the quality of treatment 
provided by the therapist. Although the present 
research demonstrates the existence of a 
relationship between communicative rapport and 
perceived progress, it in no way specifies 
whether this relationship stems from 
variability in treatment quality, from distortions 
in perception of progress on the part of the 
therapists, or from a host of other possible 
determinants which may covary with the type 
of criterion here employed. 


Summary 


The semantic differential method was used 
to examine certain cognitive aspects of the re- 
habilitation process in a hospital for people 
with chronic physical illnesses. The factors of 
age, sex, length of time in the hospital, par- 
ticipation in an intensive treatment program, 
and improvement in physical and occupational 
therapy were investigated. Each variable was 
analyzed in terms of the “semantic distance” 
between patients and occupational and physi- 
cal therapists. Within the limits of the group 
of Ss selected, and with respect to the specific 
measuring instrument, no significant differences 
were found for the variables age and sex. 
The group of patients hospitalized more than 
one year were significantly different from the 
group of patients hospitalized less than one 
year in terms of directly expressed seman- 
tic associations, perhaps because of an in- 
creased indifference on the part of the former 
group. “Semantic distance” between patient 
and therapist was found to be significantly 
reduced under conditions of intensive treat- 
ment. It was also significantly reduced for pa- 
tients who had shown “great” improvement 
in occupational therapy, although the same 
was not true with respect to progress in 
physical therapy. No differences in directly 
expressed semantic associations were found with 
respect to the variables “intensive treatment” 
and “improvement in therapy.” 

The data were explained on the basis of 
the hospital situation, the characteristics of 
physical and occupational therapy, and the 
type of intensive care provided. A theory of 
“patterning”² was presented to unify the 
findings and to suggest further research. 


Auble, D. Extended tables for the Mann-Whitney 

Barker, R. G., et al. Adjustment to physical handi- 
Behn-Rorschach and Rorschach under Standard and 
Stress Conditions 


Fred Schwartz and Solis L. Kates 


University of Massachusetts 


The present investigation is designed to 
evaluate the equivalence of the Behn and 
Rorschach tests under standard and stress 
conditions. The issue of their equivalence is 
especially relevant to the use of the Behn with 
the Rorschach in studies of stress (3). To 
date, a small number of investigations have 
indicated that the two tests are generally 
equivalent under standard test conditions (1, 
2, 7). As the Behn is said to have a more 
clearly delineated form than the Rorschach 
(7), it was hypothesized that the Behn would 
have more FSh, FK, Fc, FC’, and FC 
responses, and that the Rorschach would have 
more ShF, KF, and CF responses. For the 
remaining comparisons under standard and 
stress conditions, the null hypothesis was 
tested. 


Procedure 


Four groups, each consisting of six female 
students, were tested and retested within one 
to two weeks with counterbalanced Behn and 
Rorschach blots. Both tests were administered 
according to Klopfer’s directions (5), with 
the exception that the Ss, where necessary, 
were asked to repeat the inkblot series, 
giving additional responses to each card, so that 
a minimum of two responses per card for 
Cards I through IX, and four responses for 
Card X, were obtained. This procedure was 
used to control R. 

Prior to the second test administration, one 
Rorschach group and one Behn group were 
exposed to experimental stress. The stress 
procedure consisted of the identical typewritten 
personality interpretation, supposedly based 
on each S’s first inkblot administration, which 
advised that the S was poorly adjusted (4). 
The stress employed was thus psychological 
in nature. 

Following the first administration, the Ss 
were divided into matched pairs on the basis 
of their Rorschach psychograms and their 
scores derived from the Manifest Anxiety 
Scale (8). The latter was included in order to 
preselect and equate the groups on a variable 
which may be relevant to stress. One member 
of each matched pair was then randomly 
assigned either to the experimental or control 
group. Following the second test 
administration, all protocols were coded and scored 
following the methods described by Klopfer (5). 
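
The random assignment of matched pairs described above can be sketched as follows. This is an illustrative sketch only: the function name, the subject labels, and the use of a seeded generator are assumptions introduced here, not details from the article.

```python
import random


def assign_matched_pairs(pairs, seed=None):
    """For each matched pair, send one member (chosen at random) to the
    experimental group and the other to the control group."""
    rng = random.Random(seed)
    experimental, control = [], []
    for first, second in pairs:
        if rng.random() < 0.5:
            experimental.append(first)
            control.append(second)
        else:
            experimental.append(second)
            control.append(first)
    return experimental, control


# Hypothetical subject labels for six matched pairs.
matched = [("S1", "S2"), ("S3", "S4"), ("S5", "S6"),
           ("S7", "S8"), ("S9", "S10"), ("S11", "S12")]
experimental_group, control_group = assign_matched_pairs(matched, seed=1956)
```

Because each pair contributes exactly one member to each group, the two groups stay equated on the matching variables whatever the outcome of the random draws.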


Results 


The means and standard deviations in 
Table 1 were compared by a three factorial 
(Type III) “mixed” analysis of variance 
design (6). All hypotheses were tested at the .05 
level of significance. Departure from normality 

for, on the basis of the 
Norton study as discussed by Lindquist (6). 
ariance was “corrected” by 


of confidence to evaluate a hypothesis at the 
account for possible “infla- 


Matching of Controls and Experimentals 


The matching of control and e 


psychograms r mena 


z i y determinin: 
the main and simple effects of the two 
treatments. No significant differences were 
obtained for 16 comparisons. 
Table 1 

Means, Standard Deviations, and Reliability Coefficients for 16 Preselected 
Rorschach and Behn Variables 

                 Behn            Rorschach 
Variable       M       SD       M       SD       Eᵃ            Cᵇ 
W             8.5     4.49     8.2     3.92      .32           .58* 
F            10.2     3.29    10.7     3.74      .16           [illegible] 
A            11.4     2.23    10.9     2.66      .24           .21 
M             2.2     1.40     2.7     1.65      .31           [illegible]** 
FM            2.7     1.64     2.5     1.50     -.71*          [illegible] 
m             0.6     0.75     0.7     0.79     -.54*          [illegible] 
FSh           3.2     2.13     2.0     1.38     -.08           .14 
ShF           0.7     0.85     0.7     0.79     -.01           .19 
FK            1.3     1.14     0.2     0.50     -.04           .00 
KF            0.3     0.55     0.2     0.59      .64*         -.08 
Fc            0.9     1.13     0.8     1.04      .16           .07 
FC’           1.0     1.29     1.0     1.15      .70*          .09 
FC            1.7     1.27     0.9     1.01      .15          -.10 
CF            0.8     1.45     1.6     1.61     -.14           .34 
RT           16.8     8.61    19.2     9.53     [illegible]    .33 
Rej           0.7     0.94     0.4     0.70      .16           .64* 


a E = Reliability coefficients for experimental Ss, Behn vs. Rorschach. 
b C = Reliability coefficients for control Ss, Behn vs. Rorschach. 


* Significant at the .05 level of confidence. 
** Significant at the .10 level of confidence. 


Comparison of Behn and Rorschach 


Two of eight predicted differences between 
the Behn and the Rorschach were obtained, 
the Behn eliciting significantly more FK and 
FC responses. Three trends were also 
obtained, the Rorschach eliciting more CF and 
higher RTs, with the Behn eliciting more FSh 
responses. In this analysis, the evaluation of 
main effects was supplemented by determin- 
ing simple effects where significant interac- 
tions were obtained (6). The occurrence of 
two differences significant at the .05 level 
from among eight comparisons exceeds chance 
expectation (9). The null hypothesis for 
eight additional variables was upheld as pre- 
dicted. These results are presented in Table 2. 


Effect of Stress 


The null hypothesis was accepted for 14 of 
16 preselected Rorschach variables, the Behn 
and Rorschach protocols not changing differ- 
entially as a consequence of experimental psy- 
chological stress. Stress did differentially af- 
fect RT and FC, the Rorschach eliciting 
shorter RTs and less FC under stress than 
the Behn. There was also a tendency for stress 


to have a differential effect upon F, FM, and 
KF. The results of the analysis of variance 
are presented in Table 2. 

The occurrence of two significant 
differences from among 16 comparisons does not 
appear to be a chance result with reference to 
“inflated probabilities.” One of the obtained 
F scores was significant at the .01 level and 
the other was significant at the .05 level, 
yielding an estimated combined probability 
value at the .05 level of significance (9). 


Effect of Manifest Anxiety Level 


The possible differential sensitivity of the 
Behn and the Rorschach to Manifest Anxiety 
Level scores was evaluated by testing the 
simple effects of anxiety level by the analysis 
of variance. In this analysis, the Rorschach 
elicited significantly more M responses in low 
anxious (M = 3.4, SD = 1.46) than in high 
anxious Ss (M = 1.8, SD = 1.49); while the 
Behn elicited significantly more FM responses 
in low anxious (M = 3.5, SD = 1.5) than in 
high anxious Ss (M = 1.9, SD = 1.38). 
However, these two differences were obtained from 
among 32 comparisons and may be a chance 
occurrence (9). 
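
Why two significant results among 32 comparisons may reflect chance can be illustrated with a binomial calculation. This is a sketch under a stated assumption, namely that the comparisons are independent and each has a .05 chance of reaching significance under the null hypothesis. The article does not describe the procedure of its reference (9), so this is not offered as that method, and the function name is hypothetical.

```python
from math import comb


def prob_at_least(k, n, alpha=0.05):
    """Probability of k or more significant results among n independent
    tests when every null hypothesis is true (binomial upper tail)."""
    return sum(
        comb(n, j) * alpha ** j * (1 - alpha) ** (n - j)
        for j in range(k, n + 1)
    )


# Two or more results significant at .05 among 32 comparisons.
p_two_of_32 = prob_at_least(2, 32)
```

Under these assumptions the probability is about .48, so two such differences among 32 comparisons would be expected by chance alone nearly half the time.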




Table 2 


Analysis of Variance for 16 Preselected 
Rorschach and Behn Variables 


                       Sourceᵃ 
                 B                 ABC 
Variable      F        p        F        p 
W            0.07              2.63 
F            0.31              3.93     .10 
A            0.40              0.04 
M            1.91              0.90 
FM           0.13              3.23     .10 
m            0.13              0.48 
FSh          4.81*    .10      1.99 
ShF          0.00              1.03 
FK          17.41     .01      1.63 
KF           0.33              4.87*    .10 
Fc           0.07              0.32 
FC’          0.00              0.86 
FC           7.51     .05      5.48     .05 
CF           3.83     .10      0.70 
RT           3.06     .10     55.40     .01 
Rej          2.29              1.17 


a A = First and second inkblot administration. 
B = Behn and Rorschach inkblot administration. 
C = Experimental (stress) and control (nonstress) groups. 
* Variance heterogeneous, p shifted from .05 to .10. 


Reliability Coefficients 


The coefficients reported in Table 1 were 
derived from the analysis of variance (6), as 
computed by Eichler (2). The control and 
experimental groups were treated as separate 
one-dimensional designs so as to separate the 
variance contributed by the stress procedure. 
The final coefficients should be interpreted 
cautiously, as they are limited by an N of 12, 
and by the fact that the experimental 
conditions included the use of matched 
homogeneous groups with R kept constant, thereby 
restricting the range of variation. 
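
One way a reliability coefficient can be derived from a one-way analysis of variance is sketched below. The article does not give its formula, so the intraclass form used here is an assumption chosen for illustration and may differ from the computation attributed to Eichler (2). The function name and data are hypothetical.

```python
def anova_reliability(pairs):
    """Intraclass-type coefficient from a one-way analysis of variance.
    Each pair holds one subject's two scores (e.g., Behn and Rorschach).
    Assumed form: (MS_between - MS_within) / (MS_between + MS_within)."""
    n = len(pairs)
    k = 2  # two administrations per subject
    grand_mean = sum(a + b for a, b in pairs) / (n * k)
    subject_means = [(a + b) / k for a, b in pairs]
    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, subject_means)
    )
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    # Undefined (division by zero) if every score is identical.
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


perfect_agreement = anova_reliability([(1, 1), (2, 2), (3, 3)])
complete_reversal = anova_reliability([(1, 3), (3, 1)])
```

A coefficient of this kind can be negative when scores on the two administrations run in opposite directions, which is compatible with the negative values reported for some variables in Table 1.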


Discussion 


The Behn and Rorschach differed signifi- 
cantly on two variables from among eight pre- 
determined comparisons. The two tests did 
not differ on an additional eight variables. 
These findings indicate that the respective 
stimulus characteristics of the two inkblot 
tests are similar in most respects, and appear 
to be dissimilar in only a few. 

The differences in the two variables, FC 
and FK, may be determined by the 
differences in the stimulus properties of the two 
tests. It appears that the relatively less 
structured stimulus properties of the Rorschach 
make FC and FK responses less likely to 
occur, in comparison to the more highly 
structured Behn.¹ Further support is given to this 
hypothesis by a significantly greater 
number of FC responses to the Behn than to 
the Rorschach as a consequence of stress. 
Assuming that there are certain changes in the 
S due to the stress condition, the Behn, 
originally holding out the greater possibility for 
FC responses, permitted the S in his presumed 
changed condition after stress to give still 
more FC responses. A similar evaluation may 
be made for the trend obtained with the KF 
variable. 

From the above discussion, one would ex- 
pect that the Rorschach, which tends to elicit 
higher RTs than the Behn under standard 
conditions, would also elicit higher RTs in 
Ss under stress. The obtained results, how- 
ever, are the reverse of this expectation. An 
explanation of this finding is beyond the scope 
of the present investigation, but it 
demonstrates that the stimulus properties of 
inkblots may interact with the experimental 
conditions in complex fashion. 

The discussed differences in the stimulus 
properties of the Rorschach and Behn are 
congruent with the moderate reliability 
coefficients reported by Eichler (2) and 
obtained in the present study. It should be noted 
that most of the nonsignificant coefficients in 
this study occur with shading and color 
variables. While these coefficients must be 
interpreted with caution, they support the 
conclusion that the stimulus properties of the two 
tests may be dissimilar for shading and color. 
These findings, subject to further verification, 
limit the degree to which the two tests are 
congruent. 


It has also been suggested (5) that such 
coefficients may provide 
the degree to which some Rorschach cate- 
gories are stable. From 
should be noted that the 


results in significant reversals in the 
coefficients for FM and m, 


1 This conclusion is also in agreement with the 
trend for the Behn to elicit more FSh responses and 
for the Rorschach to elicit more CF responses. 

to significant coefficients of KF and FC’, and 
a change from significant to nonsignificant co- 
efficients for W, F, M, and Rej. It is there- 
fore hypothesized that these seven variables 
may be especially relevant for an evaluation 
of the effect of stress on Rorschach perform- 
ance. The results for KF and FC’ also sug- 
gest that the unreliability of these categories 
in the control group may be due to their ex- 
treme sensitivity to situational stress. In addi- 
tion, significantly high coefficients were ob- 
tained in the control group for six variables, 
even though the test-retest conditions may be 
dissimilar in some respects. This finding sup- 
ports the stability of these variables in the 
evaluation of personality characteristics. 


Summary 


The present investigation was designed to 
investigate the correspondence of the Behn 
and the Rorschach inkblot series under stand- 
ard and under stress conditions for matched 
homogeneous groups. 

It was concluded that the Behn and Ror- 
schach are approximately equivalent for most 
response categories. It was also concluded that 
the obtained differences between the Ror- 
schach and Behn under standard and stress 
conditions may be attributed in part to dif- 
ferences in the stimulus properties of the two 
tests. 




Some evidence was presented concerning 
the reliability of some Rorschach response 
categories in evaluating personality charac- 
teristics. 


Received November 12, 1956. 
Visual and Verbal Presentation of TAT Stimuli 


Dell Lebo and Margaret Harrigan¹ 


Richmond Professional Institute 


A previous publication by Lebo (11) indi- 
cated that TAT cards with the most consistent 
negative stimulus value were generally those 
described in negative terms by Murray in the 
TAT manual (12). Such a finding suggested 
that the card descriptions, if simply read to 
subjects, might elicit responses comparable in 
many ways to those obtained from the pic- 
tures themselves. 

If verbal descriptions of the TAT plates are 
capable of evoking responses similar to those 
given to the standard visual TAT card stimuli, 
several advantages become apparent. In the 
first place, the verbal test could be adminis- 
tered to groups of subjects without the neces- 
sity for sharing the pictures, and without some 
of the other difficulties encountered in group 
administration of the TAT. Verbal descrip- 
tions might also be easier to translate into a 
foreign language than having the pictures re- 
drawn to conform to another cultural group. 
Lastly, a verbal form of the TAT might be 
valuable as a projective test for blind persons. 
At present, only a severely limited number of 
adaptations of projective techniques are avail- 
able for such people. 


Problem 


It was the aim of the present investigation 
to study the results of a visual (card) and 
verbal (descriptive) presentation of TAT 
stimuli. Such investigation rested on two main 
postulates: 1. The descriptions of the TAT 
pictures furnished in the manual (12) were 
adequate descriptions of the cards. 2. 
Material elicited by responding to TAT stimuli 
could be measured and compared with 
meaningful results. There seemed to be no reason 
why the principles of interpretation applied 
to the TAT could not apply equally well to 
the verbal material. 


1 The authors wish to express their gratitude to 
Drs. Donald P. Ogdon and Archer L. Michael for 
their assistance in obtaining subjects for this 
experiment. 


The hypothesis for the present study was 
that verbal TAT stimuli would provide cues 
similar to those of the cards themselves; that 
the cards and their verbal descriptions would 
be sufficiently similar in their effect upon the 
subject to evoke comparable responses. 


Method 
Procedure 


Thirty-two female students, between 18 and 20 
years of age, selected from introductory courses in 
psychology, were used as subjects. Previous research 
had shown that the responses of college students to 
the TAT did not differ significantly from those of 
the general population and that IQ was not a 
differentiating factor (3, 5). All the subjects were tested 
individually by the junior author. 

All 20 cards for adult female subjects were 
presented in single individual settings. Garfield et al. 
(10) found no significant differences between the 
results of tests given in one or two sessions. The cards 
were divided so that alternate numbers were 
presented in the standard visual manner, and the others 
in the form of verbal descriptions as presented by 
Murray (12), read aloud by the examiner. The 
descriptions for cards 14 and 20 were altered slightly 
from those of Murray. The phrase “man (or woman)” 
was changed to read “person.” The use of both words 
might have seemed confusing, and the use of either 
one would have made the descriptions less ambiguous 
than the pictures. Each description was repeated 
twice. Standard instructions for card presentation 
from Stein’s manual (14) were followed. For the 
presentation of the descriptions, suitable changes were 
made. Thus, “I am going to describe some scenes” 
was substituted for “I am going to show you some 
pictures.” 

The order and manner of presentation was rotated, 
e.g., ABCD, BCDA, CDBA, and DCBA, to 
counterbalance practice and fatigue effects. To keep 
the test material as nearly as possible in its original 
order, the first ten cards were presented before the 
second ten. Lebo (11) had suggested that it might 
be emotionally disquieting to present a mixture of 
cards from both series. 


Evaluation of Data 


The data were evaluated by means of: (a) 
a word count; (b) an idea count (8); (c) a 
rating scale for emotional tone, or mood, of 
both the theme of the story and the outcome 
(9); (d) a rating scale for level of response, 
or the degree to which subjects included fac- 
tors required by directions and responded with 
feeling (15); and (e) the dynamic content (1). 
Since responses may be productive in quan- 
tity without being clinically significant, the 
amount of dynamic content should be a help- 
ful comparison of visual and verbal TAT 
stimuli. 

Normative data (5, 6, 9, 15) were also used 
as a basis for comparison with the data of the 
present investigation. 


Statistical Analysis 


Statistical treatment of projective material 
has long been a problem. Despite recent de- 
velopments and refinements, it still seems to 
be in an uncertain stage. This uncertainty is 
reflected in the present investigation in the 
inclusion of both parametric and nonpara- 
metric statistics. 


Results 


Results of an analysis of variance for the 
different methods of evaluation are presented 
in Table 1. When responses to verbal de- 
scriptions and to the plates themselves were 
compared, there was no significant difference 
attributable to the method of presentation ex- 


Table 1


Analysis of Variance for Methods of Evaluation


Method of              F for           F for      F for
Evaluation             Presentation    Cards      Interaction

Word count               .79            2.05**     1.46
Idea count               .43            2.10**     1.18
Story mood              2.93           20.49**     [illegible]
Outcome mood            2.10            4.35**     1.32
Level of response       3.68*           7.68**      .91
Dynamic content          .14             .21        .09


 * Significant at the .05 level of confidence.
** Significant at the .01 level of confidence.


Dell Lebo and Margaret Harrigan 


Table 2


Median Test for Significant Differences Between
Responses to Visual and Verbal Stimuli


Method of
Evaluation             χ²

Word count              .02
Idea count             0.0
Story mood             4.01*
Outcome mood            .42
Level of response      0.0
Dynamic content        1.69


* Significant at the .05 level of confidence.


cept for level of response, which varied to a
degree significant at the .05 level. As the level
of response was slightly higher for the verbal 
descriptions than for the pictures, this differ- 
ence suggested that responses to verbal de- 
scriptions were not inferior in quality to stand- 
ard responses. 

The results of the median test are presented 
in Table 2. Again, there did not appear to be 
much significant variation between responses 
to visual and verbal presentation of the 
stimuli, except, this time, in emotional tone,
which differed significantly at the .05 level.
One explanation for this finding is that the 
emotional tone of the pictures may not be 
completely or accurately conveyed in every 
verbal description. 
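
As a rough illustration of the kind of computation behind Table 2, the sketch below runs a two-sample median test as a 2 x 2 chi-square with one degree of freedom. The article does not say how scores equal to the median were handled or whether a continuity correction was applied; here ties are counted as "not above" and no correction is used, both of which are assumptions. The function name and the sample scores are hypothetical.

```python
from statistics import median


def median_test_chi2(group_a, group_b):
    """Chi-square (1 df) for a two-sample median test.

    Assumptions (not stated in the article): scores equal to the
    combined median count as "not above", and no continuity
    correction is applied.
    """
    grand_median = median(list(group_a) + list(group_b))

    # Build the 2 x 2 table: above / not above the combined median.
    a_above = sum(x > grand_median for x in group_a)
    b_above = sum(x > grand_median for x in group_b)
    a_below = len(group_a) - a_above
    b_below = len(group_b) - b_above

    n = len(group_a) + len(group_b)
    row_above = a_above + b_above
    row_below = a_below + b_below
    if 0 in (row_above, row_below, len(group_a), len(group_b)):
        return 0.0  # degenerate table: nothing to compare

    numerator = n * (a_above * b_below - b_above * a_below) ** 2
    denominator = row_above * row_below * len(group_a) * len(group_b)
    return numerator / denominator


# Hypothetical word counts for stories told to pictures and to descriptions.
visual = [120, 135, 150, 160, 175, 190]
verbal = [110, 140, 155, 165, 170, 200]
chi2 = median_test_chi2(visual, verbal)
print(f"chi-square = {chi2:.2f}")
```

With one degree of freedom, a value above 3.84 would be significant at the .05 level.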

Product-moment and rank-order correlations 
with normative data are shown in Table 3.
On the basis of that table it may be said that 
the emotional tone of the stories (9), the level 
of response (15), and common themes 


Table 3


Correlations of Visual Stimuli with Normative Data
and with Verbal Stimuli


                       Product-Moment            Rank Order
Method of              Visual       Visual       Visual      Visual
Evaluation             and Norm     and Verbal   and Norm    and Verbal

Story mood             [illegible]   .795*        .830*       .708*
Level of response       .777*        .764*        .694*       .69
Common themes           .622*        .469*


* Significant at .01 level of confidence.


Visual and Verbal Presentation of TAT Stimuli 


were all elicited by verbal descriptions of the 
cards in a manner similar to those evoked by 
the pictures’ stimuli for the different cards. 

The agreement between judges’ rating of 
ten per cent of the data, randomly selected, 
ranged from 64% to 100% for the various 
methods of data evaluation. This agreement 
for a small sample rating seems fairly high, 
since, as Dana (3) has pointed out, agree- 
ment of 50 to 60 per cent is usual. Correlation 
coefficients for judges’ ratings ranged from 
.653 to 1.0, all significant at the .01 level of 
confidence. 


Discussion 


The results in general substantiated the hy- 
pothesis that verbal descriptions of TAT pic- 
tures evoked responses comparable to those of 
the TAT itself. Considering that the descrip- 
tions were not designed to elicit stories, but 
to identify the cards, the present writers feel 
that these findings are relatively encouraging 
for the utilization of the verbal test in fur- 
ther investigations, if not as an approxima- 
tion, of the pictorial test. 

It has been said that story telling is one of 
the oldest forms of the projective approach 
(13). Yet Bell has indicated that this method
has “not progressed beyond the exploratory
stages” (2, p. 71). Since “there are about as
many ways of analyzing TAT stories as there
are clinical psychologists who use the method”
(7, p. 125), and many of the ways should be
applicable to verbal TAT stimuli, perhaps the
storytelling approach may leave the exploratory
stage and enter the experimental, via the
path of card description.

Even though an examination of the data derived
from the visual or pictorial TAT and
the verbal TAT stimuli revealed no consistent
pattern favoring either procedure, it must be
realized that these findings are limited. The
circumstances of this study served to insure
optimal conditions in that the subjects were
youthful students with a background stressing
reward through effort. Hence, they can be assumed
to have cooperated to a maximum extent.
Only further investigation can safely indicate
the degree to which the findings of the
present study can be generalized. More specifically,
the verbal TAT would probably produce
very different protocols from the blind
than were obtained in this study. Other investigations
have shown that projective test
results with the blind often present seemingly
abnormal evidence when compared to standard
norms (4).

Much more work remains before verbal ad- 
ministration of the test may truly be useful in 
group administration, for translation into for- 
eign tongues, and for adaptation for blind or 
partially sighted individuals. Some of this 
work is now in progress and should be re- 
ported in the literature from time to time. 


Summary 


Previous work by Lebo had suggested that 
Murray’s verbal descriptions of the TAT 
cards were, in some respects, similar to the 
cards themselves. The present experiment 
compared the responses of 32 female college 
students to TAT pictorial and verbal stimuli. 

It was found that the substitution of verbal 
description for visual plates was apparently 
justified, insofar as the present subjects were 
concerned. For when the two methods of 
presentation were compared objectively on 
several bases it was found that one method 
did not appear consistently superior to the 
other. Indeed, despite the fact that card de- 
scriptions were not devised to replace the 
cards, responses to the verbal descriptions 
were more like than unlike responses to the 
cards, according to the measures employed. 


Received October 29, 1956. 


An Inventory for Assessing Different
Kinds of Hostility¹


Arnold H. Buss and Ann Durkee 


Carter Memorial Hospital, Indianapolis 


In their everyday functioning, clinical psy- 
chologists are alert to the ways in which hos- 
tility is expressed, and they are usually care- 
ful to distinguish various modes of expression. 
When the aggression is overt and direct, a 
distinction is usually made between verbal 
hostility and physical assault. Overt mani- 
festations are clearly separated from covert 
manifestations of hostility, e.g., cursing and 
threatening behavior vs. gossiping and round- 
about derogation. 

Since meaningful distinctions can be made 
between subclasses of hostility, a global 
evaluation of hostility would seem to contain 
considerable ambiguity. The statement “He is 
hostile” would apply equally well to a man 
who beats his wife and to a man who is spite- 
fully late for appointments. Thus, it should 
be expected that attempts to assess hostility 
would include not only a global estimate of 
intensity but also estimates of the intensity 
of the various subhostilities. 

The writers know of no published hostility
inventory that attempts more than a global
estimate of hostility. Three of the more recently
developed inventories, those of Cook
and Medley (2), Moldawsky (1), and Siegel
(9), all consist of items selected from the
MMPI by clinical psychologists. None of
these investigators attempted to group items
into subscales representing various aspects of
hostility. Thus a nonsuspicious, assaultive individual
might receive the same score as a
nonassaultive, suspicious individual. A score
on one of these inventories would appear to
be as ambiguous as the statement “He is hostile.”
What is clearly needed is an inventory
that attempts to assess the various aspects of
hostility. This paper describes the development
of such an inventory.


1 The writers wish to acknowledge the considerable
efforts of Dr. Herbert Gerjuoy in obtaining subjects
and in facilitating statistical analyses.


Construction of the Inventory 
Varieties of Hostilities 


The first task was to define the subclasses 
of hostility that are typically delineated in 
everyday clinical situations. Such a classifica- 
tion was made in an earlier study (1), and the 
present classification is an elaboration of the 
previous one. 


Assault—physical violence against others. This includes
getting into fights with others but not destroying
objects.

Indirect Hostility—both roundabout and undirected
aggression. Roundabout behavior like malicious
gossip or practical jokes is indirect in the sense
that the hated person is not attacked directly but
by devious means. Undirected aggression, such as
temper tantrums and slamming doors, consists of a
discharge of negative affect against no one in particular;
it is a diffuse rage reaction that has no direction.

Irritability—a readiness to explode with negative
affect at the slightest provocation. This includes quick
temper, grouchiness, exasperation, and rudeness.

Negativism—oppositional behavior, usually directed
against authority. This involves a refusal to cooperate
that may vary from passive noncompliance to open
rebellion against rules or conventions.

Resentment—jealousy and hatred of others. This
refers to a feeling of anger at the world over real or
fantasied mistreatment.

Suspicion—projection of hostility onto others. This
varies from merely being distrustful and wary of
people to beliefs that others are being derogatory or
are planning harm.

Verbal Hostility—negative affect expressed in both
the style and content of speech. Style includes arguing,
shouting, and screaming; content includes
threats, curses, and being overcritical.


Item-writing Techniques


The writers constructed a pool of items and 
supplemented this pool with items borrowed 
from previous inventories. Most of the bor- 
rowed items underwent modification, and the 
following principles served as guides in writ- 
ing and selecting items. 


1. The item should refer to only one subclass of
hostility, since an item that overlaps several categories
would not help in distinguishing patterns of
hostility.

2. The behaviors and attitudes involved should be 
specific, and the stimulus situations that arouse them 
should be near universal, e.g., “It makes my blood 
boil to have people make fun of me.” “Makes my 
blood boil” is a fairly specific response, and being 
ridiculed is a common situation for most people. 

3. The item should be worded so as to minimize 
defensiveness in responding. It has been established 
that social desirability accounts for much of the 
variance of normals’ responses to inventories (4, 5). 
In attempting to facilitate respondents’ admitting to 
socially undesirable behaviors, three item-writing 
techniques were employed: 

First, assume that the socially undesirable state al- 
ready exists and ask how it is expressed, e.g., “When 
I really lose my temper I am capable of slapping 
someone,” “When I get mad, I say nasty things.” In 
these items the loss of temper is assumed, and the 
subject is asked only whether he expresses it physi- 
cally. This procedure emphasizes a report of behavior 
and tends to minimize the value judgments associated 
with hostility. 

Second, provide justification for the occurrence of 
hostile behavior, e.g., “Whoever insults me or my 
family is asking for a fight,” “People who continually 
pester you are asking for a punch in the nose,” “Like 
most sensitive people, I am easily annoyed by the 
bad manners of others.” When the item provides a 
rationale for the aggression, the subject’s defensive 
and guilt reactions are reduced, and he does not 
necessarily answer in the direction of social desirability.

Third, use idioms, e.g., “If somebody hits me first, 
I let him have it,” “When I am mad at someone, I 
will give him the silent treatment.” Idioms have a 
high frequency of usage in everyday life, and these 
phrases are typically used by subjects to describe 
their own behavior and feelings to others. It is an- 
ticipated that these phrases will merely echo what 
the subject has previously verbalized, and therefore 
when such phrases apply, they will be readily ac- 
cepted and admitted. 

4. Take into account the effects of response set 
by including both true and false items. If all the 
items were scored in the direction of hostility only 
when marked “True,” a subject could get a low score 
simply by answering all the items “False.” Ideally, 
this kind of response set is best controlled when the 
number of true items equals the number of false 
items. However, such equality was not feasible because
of the difficulty of constructing false items that
met the other criteria. Therefore, a compromise ratio 
of three true items to one false one was adopted. 
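
The keying logic described in principle 4 can be sketched as follows. This is only an illustration: the function name, the four-item scale, and the item numbers are hypothetical and are not taken from the inventory.

```python
def score_scale(responses, false_keyed):
    """Count the answers given in the hostile direction.

    responses   -- dict mapping item number to True/False as marked
    false_keyed -- set of item numbers for which "False" is the
                   hostile answer (all other items are True-keyed)
    """
    score = 0
    for item, answer in responses.items():
        hostile_answer = item not in false_keyed  # True unless False-keyed
        if answer == hostile_answer:
            score += 1
    return score


# Hypothetical four-item scale at the 3:1 ratio: items 1-3 are
# True-keyed and item 4 is False-keyed.
FALSE_KEYED = {4}
all_false = {1: False, 2: False, 3: False, 4: False}
all_true = {1: True, 2: True, 3: True, 4: True}

print(score_scale(all_false, FALSE_KEYED))
print(score_scale(all_true, FALSE_KEYED))
```

In this sketch a subject who marks every item “False” scores 1 rather than 0, and one who marks every item “True” scores 3 rather than 4, which is the sense in which mixed keying blunts a uniform response set.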


On the basis of the foregoing considerations, 
a pool of hostility items was compiled. Next 
it was decided to add the variable of guilt be- 
cause the relationship of guilt to the various 
subhostilities is of clinical interest. Accord- 
ingly, items were compiled for a Guilt scale, 
with guilt being defined as feelings of being 
bad, having done wrong, or suffering pangs of 
conscience. 


Item Analyses 


The first version of the inventory consisted 
of 105 items, with items from each scale ran- 
domly scattered throughout the inventory. It 
was administered in group fashion to 85 male 
and 74 female college students. In an attempt 
to reduce defensiveness, all protocols were 
anonymous. The various hostility scales and 
the Guilt scale were scored, and separate item 
analyses were performed for men and women. 

Two criteria were used in item selection:
frequency and internal consistency. Frequency
refers to the occurrence of the particular behavior
in the population, as measured by the
proportion of the sample answering in the direction
of hostility (or guilt). If a given behavior
is near-universal in the population or
virtually absent, it obviously does not distinguish
between individuals. A criterion of frequency
is necessary to eliminate items that
are answered in one direction by virtually
everyone, and it was decided to accept only
items answered in one direction by 15-85%
of the sample.

Internal consistency was measured by the
correlation of an item with the score of the
scale in which it belonged. Since the items are
scored dichotomously, the biserial correlation
coefficient was used. The criterion for item
selection was a correlation of at least .40 for
both the male and female samples.
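
The two selection criteria can be sketched as below. The biserial coefficient is computed with the usual textbook formula, r_b = ((M1 - M0) / s) * (pq / y), where y is the ordinate of the unit normal curve at the point dividing it into proportions p and q. Whether the writers used the population or sample standard deviation, and whether an item's own score was included in its scale score, is not stated; the population form is assumed here, and the function names and data are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def endorsement_rate(item_scores):
    """Proportion of subjects answering in the keyed (hostile) direction."""
    return sum(item_scores) / len(item_scores)


def biserial(item_scores, scale_scores):
    """Biserial correlation between a 0/1 item and a scale score."""
    p = endorsement_rate(item_scores)
    q = 1 - p
    sd = pstdev(scale_scores)  # assumption: population SD
    if p in (0.0, 1.0) or sd == 0:
        return 0.0  # item or scale does not vary
    m1 = mean(s for i, s in zip(item_scores, scale_scores) if i == 1)
    m0 = mean(s for i, s in zip(item_scores, scale_scores) if i == 0)
    unit_normal = NormalDist()
    y = unit_normal.pdf(unit_normal.inv_cdf(p))  # ordinate at the cut
    return (m1 - m0) / sd * (p * q / y)


def keep_item(item_scores, scale_scores, low=0.15, high=0.85, min_r=0.40):
    """Apply the 15-85% frequency rule and the .40 item-scale rule."""
    p = endorsement_rate(item_scores)
    return low <= p <= high and biserial(item_scores, scale_scores) >= min_r


# Hypothetical sample of six subjects: item answers and scale scores.
item = [1, 1, 1, 0, 0, 0]
scale = [9, 8, 7, 3, 2, 1]
print(keep_item(item, scale))
```

Under the procedure described above, an item would be retained only if `keep_item` held separately for the male and the female sample.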

Only 60 of the original 105 items met the
frequency and internal consistency criteria.
The number of items in several of the scales
was so low that unreliability (lack of test-retest
stability) seemed inevitable. Therefore,
additional new items were written and old
ones modified. Most of the modifications were
attempts to alter the frequency measure, i.e.,
decrease the popularity of items universally
endorsed and increase the popularity of items
rarely endorsed.

The revised inventory contained 94 items. 
It was administered in group fashion to 62 
male and 58 female college students, and 
separate item analyses were performed for
each sex. Again the minimum item-scale cor- 
relation was set at .40, but this time the 
frequency criterion was modified. The first 
item analysis had revealed sex differences in 
the proportion of the sample answering in the 
direction of hostility (or guilt). For several 
items the proportion of male students was 
over 15%, but the proportion of female stu- 
dents was under 15%. Since the 15-85% fre- 
quency criterion might eliminate items that 
differentiated between men and women, a less 
stringent criterion was adopted: 15-85% for 
either men or women. In addition, an attempt 
was made to insure that each scale contained 
items whose frequencies varied over a wide 
range. 

The second item analysis yielded 75 items, 
66 for hostility and 9 for guilt. It was found 
that more False items were discarded than 
True items, and the final form of the inven- 
tory contains 60 True items and 15 False 
items, a ratio of four to one. The items com- 
prising the final form of the inventory are 
listed in Table 1. Each item is grouped with 
the other items in its scale, and the False 
items are marked “F.” 


Social Desirability 


Responses to inventory items are at least in
part determined by the respondent’s desire to
place himself in a favorable light. This tendency
assumes great importance in a hostility
inventory, which deals with behaviors that are
generally regarded as socially unacceptable.
The potency of the tendency to give socially
desirable answers has been demonstrated by
Edwards (4). He had college students assign
each of 140 personality trait items to one
of nine intervals of social desirability. Scale
values for social desirability were obtained by
the method of successive intervals. Then the
140 items were administered to different college
students, with standard inventory instructions.
The correlation between social desirability
and probability of endorsing the items
was .87. Subsequent studies with other inventories
have confirmed the fact that social desirability
is an important uncontrolled variable
in many present-day inventories (5, 8).

In constructing the present inventory, an
attempt was made to minimize the variable of
social desirability. In order to test the success
of this attempt, Edwards’ procedure (4) was
followed. The 66 hostility items of the final
inventory were scaled for social desirability,
using the method of successive intervals. The
judges were 85 male and 35 female college
students. The men’s and women’s judgments
were quite similar, and they were pooled.
Next, the inventories of 62 men and 58 women
(who had previously taken the inventory and
were different from the judges) were used to
determine the probability of endorsement for
each of the 66 hostility items. The product-moment
r’s between social desirability and
probability of endorsement were .27 for the
men and .30 for the women. Both correlations
are significantly above zero at the .05 level of
confidence, which suggests that social desirability
has a small but significant effect
on the direction of responding.
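
The check described above reduces to a product-moment correlation, taken across items, between each item's social desirability scale value and its probability of endorsement. A minimal sketch follows; the function names are hypothetical, and the five pairs of values are invented for illustration and are not data from the study.

```python
from math import sqrt


def endorsement_probability(keyed_answers):
    """Proportion of respondents answering an item in the hostile
    direction (each answer coded 1 if hostile, 0 otherwise)."""
    return sum(keyed_answers) / len(keyed_answers)


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists.

    Raises ZeroDivisionError if either list has no variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical values for five items (one pair per item).
desirability = [0.4, 0.9, 1.3, 1.8, 2.2]
endorsement = [0.35, 0.50, 0.30, 0.60, 0.55]
r = pearson_r(desirability, endorsement)
print(f"r = {r:.2f}")
```

In the study the same computation would run over the 66 hostility items, once with the men's endorsement proportions and once with the women's.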

However, these two correlations are considerably
lower than the correlation of .87 reported
by Edwards (4). In accounting for this
discrepancy two differences between his study
and the present one should be noted. First,
the present inventory [...]
extremely desirable behaviors.

The restriction of range can be clearly seen
when social desirability scale values of the
two inventories are compared. The scaling
procedures were identical, but the present
range of scale values was .23 to 2.38 while
Edwards’ range was .50 to 4.70.² A cur-

2 Edwards’ scale values for social desirability and
his probability of endorsement values were estimated
from his figure.
Table 1
Items Comprising the Hostility-Guilt Inventory *
(F = False items)


Assault:

  1.  Once in a while I cannot control my urge to harm
      others. (9)
  2F. I can think of no good reason for ever hitting
      anyone. (17)
  3.  If somebody hits me first, I let him have it. (25)
  4.  Whoever insults me or my family is asking for a
      fight. (33)
  5.  People who continually pester you are asking for
      a punch in the nose. (41)
  6F. I seldom strike back, even if someone hits me
      first. (1)
  7.  When I really lose my temper, I am capable of
      slapping someone. (49)
  8.  I get into fights about as often as the next person.
      (57)
  9.  If I have to resort to physical violence to defend
      my rights, I will. (65)
 10.  I have known people who pushed me so far that
      we came to blows. (70)


Indirect:

  1.  I sometimes spread gossip about people I don’t
      like. (2)
  2F. I never get mad enough to throw things. (10)
  3.  When I am mad, I sometimes slam doors. (26)
  4F. I never play practical jokes. (34)
  5.  When I am angry, I sometimes sulk. (18)
  6.  I sometimes pout when I don’t get my own way.
      (42)
  7F. Since the age of ten, I have never had a temper
      tantrum. (50)
  8.  I can remember being so angry that I picked up
      the nearest thing and broke it. (58)
  9.  I sometimes show my anger by banging on the
      table. (75)


Irritability:

  1.  I lose my temper easily but get over it quickly. (4)
  2F. I am always patient with others. (27)
  3.  I am irritated a great deal more than people are
      aware of. (20)
  4.  It makes my blood boil to have somebody make
      fun of me. (35)
  5F. If someone doesn’t treat me right, I don’t let it
      annoy me. (66)
  6.  Sometimes people bother me just by being around.
      (12)
  7.  I often feel like a powder keg ready to explode. (44)
  8.  I sometimes carry a chip on my shoulder. (52)
  9.  I can’t help being a little rude to people I don’t
      like. (60)
 10F. I don’t let a lot of unimportant things irritate
      me. (71)
 11.  Lately, I have been kind of grouchy. (73)


Negativism:

  1.  Unless somebody asks me in a nice way, I won’t
      do what they want. (3)
  2.  When someone makes a rule I don’t like I am
      tempted to break it. (12)
  3.  When someone is bossy, I do the opposite of what
      he asks. (19)
  4.  When people are bossy, I take my time just to
      show them. (36)
  5.  Occasionally when I am mad at someone I will
      give him the “silent treatment.” (28)


Resentment:

  1.  I don’t seem to get what’s coming to me. (5)
  2.  Other people always seem to get the breaks. (13)
  3.  When I look back on what’s happened to me, I
      can’t help feeling mildly resentful. (29)
  4.  Almost every week I see someone I dislike. (37)
  5.  Although I don’t show it, I am sometimes eaten
      up with jealousy. (45)
  6F. I don’t know any people that I downright hate.
      (21)
  7.  If I let people see the way I feel, I’d be considered
      a hard person to get along with. (53)
  8.  At times I feel I get a raw deal out of life. (61)


Suspicion:

  1.  I know that people tend to talk about me behind
      my back. (6)
  2.  I tend to be on my guard with people who are
      somewhat more friendly than I expected. (14)
  3.  There are a number of people who seem to dislike
      me very much. (22)
  4.  There are a number of people who seem to be
      jealous of me. (30)
  5.  I sometimes have the feeling that others are
      laughing at me. (38)
  6.  My motto is “Never trust strangers.” (46)
  7.  I commonly wonder what hidden reason another
      person may have for doing something nice for
      me. (54)
  8.  I used to think that most people told the truth
      but now I know otherwise. (62)
  9F. I have no enemies who really wish to harm me. (67)
 10F. I seldom feel that people are trying to anger or
      insult me. (72)


Verbal:

  1.  When I disapprove of my friends’ behavior, I let
      them know it. (7)
  2.  I often find myself disagreeing with people. (15)
  3.  I can’t help getting into arguments when people
      disagree with me. (23)
  4.  I demand that people respect my rights. (31)
  5F. Even when my anger is aroused, I don’t use
      “strong language.” (39)
  6.  If somebody annoys me, I am apt to tell him
      what I think of him. (43)
  7.  When people yell at me, I yell back. (47)
  8.  When I get mad, I say nasty things. (51)
  9F. I could not put someone in his place, even if he
      needed it. (55)
 10.  I often make threats I don’t really mean to carry
      out. (59)
 11.  When arguing, I tend to raise my voice. (68)
 12F. I generally cover up my poor opinion of others.
      (63)
 13F. I would rather concede a point than get into an
      argument about it. (74)


Guilt:

  1.  The few times I have cheated, I have suffered
      unbearable feelings of remorse. (8)
  2.  I sometimes have bad thoughts which make me
      feel ashamed of myself. (16)
  3.  People who shirk on the job must feel very guilty.
      (24)
  4.  It depresses me that I did not do more for my
      parents. (32)
  5.  I am concerned about being forgiven for my sins.
      (40)
  6.  I do many things that make me feel remorseful
      afterward. (48)
  7.  Failure gives me a feeling of remorse. (56)
  8.  When I do wrong, my conscience punishes me
      severely. (64)
  9.  I often feel that I have not lived the right kind
      of life. (69)


* The numbers in parentheses indicate the sequence of items in the mimeographed form of the inventory.


tailed distribution decreases the magnitude of 
a correlation coefficient, but it is possible to 
adjust for a difference in standard deviations 
(7, pp. 149-150). When Edwards’ correlation 
of .87 between social desirability and prob- 
ability of endorsement is adjusted to the pres- 
ent range of values, it becomes .74. There is 
still a large disparity between Edwards’ corrected
correlation of .74 and the present ones
of .27 and .30, and the curtailment of the 
range of social desirability evidently accounts 
for only a small part of the discrepancy. 
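
The kind of adjustment referred to above can be sketched with the standard formula for direct restriction of range, r' = rk / sqrt(1 - r² + r²k²), where k is the ratio of the restricted to the unrestricted standard deviation. The writers cite a textbook (7) without giving the formula or the standard deviations, so both the choice of formula and the ratio used below are assumptions; the ratio of about .62 was simply back-solved so that .87 maps onto the reported .74. The function name is hypothetical.

```python
from math import sqrt


def adjust_for_range(r, sd_ratio):
    """Estimate the correlation expected in a narrower range.

    r        -- correlation observed in the wider-range group
    sd_ratio -- restricted SD divided by unrestricted SD
                (1.0 leaves r unchanged)
    """
    k = sd_ratio
    return r * k / sqrt(1 - r ** 2 + (r * k) ** 2)


# Assumed SD ratio (not reported in the article), chosen to
# reproduce the adjusted value given in the text.
adjusted = adjust_for_range(0.87, 0.6235)
print(f"adjusted r = {adjusted:.2f}")
```

Even a ratio this far below 1.0 lowers the correlation only from .87 to about .74, which is the arithmetic behind the conclusion that curtailment of range accounts for little of the gap.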
The second difference between the studies 
lies in the construction of the present inven- 
tory. The writers were aware that social de- 
sirability might influence inventory responses, 
and attempted to minimize its effect by: (a) 
assuming that anger was present and inquir- 
ing only how it is expressed; (b) providing 
justification for admitting aggressive acts; 
and (c) including cliches and idioms that 
would find ready acceptance. On the other
hand, Edwards used a list of unelaborated 
personality trait names, and there was no at- 
tempt to manipulate the wording of the items. 
Thus, the present low correlations between 
social desirability and probability of endorse- 
ment would seem to reflect the success of the 
item construction techniques used in the present
study.

Previous attempts at controlling social desirability
have taken two forms. The first is
to develop suppressor variables like the “validity”
scales of the MMPI (6). The second
approach is to scale items for social desirability
and then use a paired-comparisons
type of inventory, in which each item is paired
with another item of matched social desirability
(3). The present study suggests a third
approach, that of focusing on the process of
item construction. Perhaps the influence of
social desirability can be substantially reduced
or eliminated at the source, i.e., in the actual
wording of the item.


Factor Analyses 


[...] were scored, and product-moment correlations
[...] and women separately.
The correlation matrices are presented
in Tables 2 and 3. None of the women’s correlations
[...] the men’s correlations [...]

Table 2
Table of Intercorrelations for Men (N = 85)

                               Indirect                                                        Verbal
Variable             Assault   Hostility   Irritability   Negativism   Resentment   Suspicion  Hostility

Indirect Hostility     .28
Irritability           .32       .44
Negativism             .30       .27         .20
Resentment         [illegible]   .33         .44            .31
Suspicion              .11       .27         .26            .38          .58
Verbal Hostility       .40       .40         .66        [remaining entries illegible]
Guilt              [entries illegible]
Table 3
Table of Intercorrelations for Women (N = 88)

                               Indirect                                                        Verbal
Variable             Assault   Hostility   Irritability   Negativism   Resentment   Suspicion  Hostility

Indirect Hostility     .38
Irritability           .30       .31
Negativism             .21       .34         .29
Resentment             .14       .23         .30            .23
Suspicion          [illegible]   .19         .30            .15       [illegible]
Verbal Hostility   [illegible]   .19         .44            .30          .22          .21
Guilt                 -.07       .05         .16            .01          .33          .27        .10


for men and women were rotated to the same 
simple structure so that the factor loadings of 
the two sexes would be comparable. These 
factor loadings are presented in Table 4. 

If only factor loadings of .40 and over are 
considered meaningful, the first factor is de- 
fined by Resentment and Suspicion for men, 
and by Resentment, Suspicion, and Guilt for 
women. The second factor is defined by As- 
sault, Indirect Hostility, Irritability, and Ver- 
bal Hostility for both sexes, with the addi- 
tion of Negativism for women. However, both 
Guilt and Negativism had positive loadings 
on their respective factors for the men, also, 
and the sex differences just noted are slight. 
In fact, the men’s and women’s factor load- 
ings are generally similar, differences being 
small and random. Since the same axes were 
used for men and women, this similarity of 
factor loadings suggests that the factor struc- 
ture is stable. 

The two factors extracted from the inter- 
correlation matrix divide hostility into an 


Table 4
Rotated Factor Loadings for Men and Women


                                   Men                              Women

Variable                I           II          h²          I       II      h²

Assault             [illegible] [illegible]     .27         .19     .61     .38
Indirect Hostility      .19         .40         .37         .00     .48     .38
Irritability        [illegible] [illegible] [illegible]     .14     .47     .44
Negativism              .23         .22         .25        -.03     .48     .34
Resentment              .59         .12         .55         .57     .04     .45
Suspicion               .66        -.02         .60         .54     .02     .45
Verbal Hostility        .05         .63         .64         .04     .49     .44
Guilt                   .29         .03         .14         .50     .28     .33

“emotional” or attitudinal component (“Peo- 
ple are no damn good”) and a “motor” com- 
ponent that involves various aggressive behav- 
iors. However, it should be noted that the 
factor loadings are not high. The average 
communality of the eight variables was .43 
for men and .40 for women, leaving consider- 
ably more than half of the test variance un- 
explained. Some of this specific variance may 
be attributed to unreliability of the scales 
(especially since they are short), but there 
seems to be much variance that is stable and 
unique. 

The presence of unique variance is not surprising,
since it seems likely that there are
more than two components of hostility. For
example, the second factor includes both Assault
and Verbal Hostility, yet there are obviously
many verbally hostile individuals who
are not assaultive. Similarly, with respect to
the first factor, resentment may be seen in
the absence of distrust and suspicion. The
presence of unique variance would seem to
reflect the presence of these patterns within
each factor.

The population used in deriving the two
factors was normal, but the factors appear to
have relevance for clinical populations. For
example, the characteristics associated with
paranoid personalities suggest that such individuals
would score high on Resentment and
Suspicion (Factor I) and low on the other
scales. On the other hand, hysterical personalities
should score low on Resentment and
Suspicion and high on Irritability, Negativism,
and Verbal Hostility. In both instances,
no prediction can be made concerning [illegible],
since this variable is thought to be related to
Table 5


Means and Standard Deviations for College
Men and Women


                          Men           No.         Women
Variable              Mean     SD      Items     Mean     SD

Assault               5.07    2.48      10       3.27    2.31
Indirect Hostility    4.47    2.23       9       5.17    1.96
Irritability          5.94    2.65      11       6.14    2.78
Negativism            2.19    1.34       5       2.30    1.20
Resentment            2.26    1.89       8       1.78    1.62
Suspicion             3.33    2.07      10       2.26    1.81
Verbal Hostility      7.61    2.74      13       6.82    2.59
Guilt                 5.34    1.88       9       4.41    2.31
Total Hostility      30.87   10.24      66      27.74    8.75


to the variables of sex, socioeconomic status, psy-
chopathology, etc.


Norms 


The collection of normative data for a new 
instrument is a long-time endeavor. In the 
present instance the process has just begun. 
Norms are being collected for clinical popula- 
tions, and the construct validity of the inven- 
tory is being investigated. At present, the only 
norms available are for the 85 college men 
and 88 college women who were administered 
the final form of the inventory. The means 
and standard deviations of these two groups 
are presented in Table 5. Since these samples
are small and not representative, the norms
must be regarded as highly tentative.


Summary 


This paper described the construction of an 
inventory consisting of the following scales:
Assault, Indirect Hostility, Irritability, Nega- 
tivism, Resentment, Suspicion, Verbal Hos- 
tility, and Guilt. The first and second versions 
of the scale were item analyzed, and the final 
revision consists of 75 items. 
The hostility items were scaled for social
desirability, and social desirability was cor- 
related with probability of endorsement. The 
r’s of .27 and .30 for college men and women,
respectively, were considerably smaller than 
those of previous studies. The reduction in the 
effects of social desirability was attributed to 
item-writing techniques. 

Factor analyses of college men’s and wom- 
en’s inventories revealed two factors: an atti- 


component 
Irritability, and 


tors to the study of abnormal as well as nor- 
mal personalities was illustrated. 
The Deliberate Use of a Set to “Fake” in
Personality Questionnaires ¹


Marshall B. Jones 
U. S. Naval School of Aviation Medicine, Pensacola, Florida ²


Test-taking attitudes pose a central prob-
lem in the measurement of personality. The
forced-choice technique and special scales, 
like the K Scale of the MMPI, represent two 
familiar approaches to this problem. This 
note is concerned with still another approach 
which, though obvious enough, has received 
little attention. 

Some scales correlate very well with them- 
selves when administered under a set to 
“fake.” Specifically, the subjects are asked to 
respond to each item not only as they think 
they are but as they think they should be. 
The test key is then applied to both sets of 
responses. The correlation between the two re- 
sultant scores, self-descriptive (s-d) and ideal- 
descriptive (i-d), will often run as high as the 
mid-.50’s. Suppose, however, that a pool of 
i-d items were analyzed against an s-d scale 
which was already well correlated with its 
i-d parallel. Conceivably an i-d predictor scale 
could be developed which would correlate well 
enough with the s-d scale to serve as a second 
form. Were this possible, the problem of test- 
taking attitudes, at least for the scale at issue, 
would be solved; we could simply use the i-d 
scale. 

To test this possibility, the 197-item Pensa- 
cola Z Survey (1) was administered to two 
samples of 264 and 325 naval aviation cadets. 
The cadets were instructed to give both the 
s-d and the i-d response to each item. Four of 


1 An extended report of this study may be ob-
tained without charge from Marshall B. Jones, 418 
South Second Street, Pensacola, Florida, or for a fee 
from the American Documentation Institute. Order 
Document No. 5272, remitting $1.25 for microfilm or 
$1.25 for photocopies. 

2 The contents of this note are not to be construed 
as necessarily reflecting the view of the Navy De- 
partment. 
Table 1


Correlations of the Z Survey Scales with Their i-d
Parallel and Predictor Scales


                  Sample 1        Sample 2
Scales            1      2        3      4
Heteronomy       .70    .70      .54    .46
Dependency       .65    .69      .57    .51
Rigidity         .50    .62      .45    [illegible]
Anxiety          .07    .57      .30    .27
Hostility        .46    .62      .42    .38

the scales of the Z Survey are of the same
length, 40 items, and have no item overlap
with each other. The fifth, Heteronomy, has
66 items, and overlaps all four of the other
scales. In the first sample of 264 cadets, all
197 i-d items were analyzed against each of
the s-d scales. Any item which correlated at
the .05 level with a given s-d scale was in-
cluded in a corresponding i-d predictor scale.
In Table 1 the correlations of the s-d scales
with their i-d parallel scales (Columns 1 and
3) and with their i-d predictor scales (Col-
umns 2 and 4) appear. Upon cross validation,
the i-d predictors correlate no better, if not a
little worse, than do the i-d parallel scales.
The burden, therefore, of this note is nega-
tive. The i-d parallel scales seem already to
have absorbed so much of the variance avail-
able to i-d items that no increase obtains from
ordinary item-analytic procedures.
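
The item-analytic procedure described above (select i-d items that correlate with an s-d scale in one sample, then check the resulting predictor scale in a second sample) can be sketched in Python. This is only an illustration under stated assumptions, not the computation used in the study: the function names, the toy 0/1 responses, and the critical value r_crit = .5 are all hypothetical. With a derivation sample of 264, a two-tailed .05 criterion would correspond to an r of roughly .12; whether the original criterion was two-tailed is not stated.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant variable cannot correlate with anything
    return sxy / math.sqrt(sxx * syy)


def build_predictor_key(id_items, sd_scores, r_crit):
    """Keep each i-d item whose |r| with the s-d scale reaches r_crit.

    Returns {item index: +1 or -1}; the sign gives the keyed direction.
    """
    key = {}
    for j in range(len(id_items[0])):
        r = pearson_r([row[j] for row in id_items], sd_scores)
        if abs(r) >= r_crit:
            key[j] = 1 if r > 0 else -1
    return key


def score(id_items, key):
    """Score every subject on the predictor scale defined by key."""
    return [sum(sign * row[j] for j, sign in key.items()) for row in id_items]


# Hypothetical toy data: rows are subjects, columns are i-d item responses.
sample1_items = [[0, 1, 0], [0, 1, 1], [0, 1, 0], [1, 0, 1], [1, 0, 0], [1, 0, 1]]
sample1_sd = [1, 2, 3, 4, 5, 6]
sample2_items = [[0, 1, 1], [1, 0, 0], [0, 1, 0], [1, 0, 1]]
sample2_sd = [2, 5, 1, 6]

# Derive the key in the first sample, then cross validate in the second.
key = build_predictor_key(sample1_items, sample1_sd, r_crit=0.5)
cross_validated_r = pearson_r(score(sample2_items, key), sample2_sd)
```

The point of the second step is that items are selected partly on chance in the derivation sample, so only the correlation obtained in a fresh sample is a fair estimate of how well the predictor scale works.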

Brief Report. 

Received March 21, 1957. 
Differential Qualitative Performance of Delinquents
on the Porteus Maze 


Gilbert Fooks 
Hartford Regional Technical High School 


and Ross R. Thomas 


Newington Home and Hospital, Connecticut 


Several studies (1, 4, 6) have reported that 
the Qualitative (Q) score of the Revised 
Porteus Maze significantly differentiates de- 
linquents from nondelinquents. The present 
study was undertaken as a further cross 
validation of these results as well as to deter- 
mine the efficacy of the new Maze Extension 
in discriminating between normal and delin- 
quent adolescents. 


Subjects and Procedure 


Twenty-five girls and twenty-five boys from 
Connecticut institutions for delinquents were 
matched with nondelinquent high school sub- 
jects from the New London High School on 
the basis of age, sex, intelligence, and socio- 
economic level of parents. Table 1 contains 
the matching data for age and intelligence. 
The intelligence estimates available for the 
delinquent group were from the Wechsler- 
Bellevue and Stanford-Binet tests; the Otis 
Group Intelligence Test was the measure 
available for the nondelinquent group. Match- 
ing on the basis of parental occupation was 
less successful, the nondelinquent children 
representing a generally higher socioeconomic 
group. 

All of the delinquents were diagnosed as 
psychopaths or having psychopathic tenden- 
cies by the institution’s psychiatrists. The 
high school subjects were judged by the school 


1 The authors wish to express their appreciation to 
the psychiatric and social service staffs at the Long
Lane School for Girls, and the Meriden School for 
Boys, and to the Principal and Deans at the New 
London High School. 


administrators to be normal high school stu-
dents with no record of antisocial behavior.
Both groups were tested with the Revised 
Maze and the new Maze Extension, the for- 
mer being given first in all cases. Scoring 
on the Revised Maze was as prescribed in 
the manual (4) with the exception of “Lift 
Pencil” which, on the advice of Porteus,² was


Table 1


Age and Intelligence of Experimental and
Control Groups


                 Delinquents             Normals
Measure        Boys      Girls       Boys      Girls
N               25        25          25        25
Mean Age       14.5      15.2        14.6      15.3
Age Range     13-16.2   13-16.9     14-16.4   13.1-16.1
Mean IQ        96.4      95.4        97        90.3
IQ Range      79-143    82-124      81-137    82-126


suggestion of Porteus to score the “Wavy Line” error on the
Extension by referring to the criteria available
for the Revised Maze.


Results and Discussion 
In Table 2 are given the results of the pres-
ent study along with those of three previous


2 Personal communication. 
Table 2


Q-Score Means, Sigmas, and Significance of Difference Between Delinquent and Normal Groups


                              Delinquents             Normals
Study                       N    Mean    SD       N    Mean    SD       p
Porteus (4)                100   49      25      100   22      13     .0001
Porteus (4)                 50   48      22      179   19      13     .0001
Docter & Winder (1)         60   47      26        6   25      20     .001
Present (Revised Maze)      50   40      18.4     50   22.2    14.8   .001
Present (Extension)         50   46.4    20.1     50   27.8    17.3   .001


entiated between the two groups, thus con- 
firming the results previously reported. No sex 
differences were found on the Q score, indicat-
ing that previous findings may be generalized
to female groups. 

It was found that Porteus’ critical cutoff 
score of 29 correctly identified 76% of the 
delinquents, and wrongly identified only 28% 
of the controls. This cutoff score is based on 
the weighted scoring system devised by Por- 
teus. Docter and Winder (1), using a non- 
weighted system in addition to the Porteus 
system, concluded that the more time-con- 
suming weighted system was not as efficient 
as the nonweighted system. Application of 
their unweighted score cutoff of 16 to the 
present group correctly identified 74% of the 
delinquents and incorrectly identified 28% of 
the control group. It may be concluded, there- 
fore, that, at least with regard to distinguish- 
ing between delinquents and nondelinquents, 
the unweighted Q score is not only easier to 
determine but is nearly as efficient as the 
Porteus system. Table 3 shows rates of identi- 
fication using both cutoff scores on the Re- 
vised Maze and on the Maze Extension, to- 
gether with a summary of previous findings 
by other investigators. 
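
The identification rates discussed above amount to counting, in each group, the subjects whose Q score lies above the cutoff. A minimal Python sketch follows; the function names and the Q scores are hypothetical, and treating “above” as strictly greater than the cutoff is an assumption, since the text does not say how scores exactly at the cutoff were handled.

```python
def percent_above(scores, cutoff):
    """Percentage of scores strictly above the cutoff (assumed convention)."""
    return 100.0 * sum(1 for s in scores if s > cutoff) / len(scores)


def identification_rates(delinquent_q, normal_q, cutoff):
    """Correct identifications among delinquents, false ones among normals."""
    return {
        "true_positive_pct": percent_above(delinquent_q, cutoff),
        "false_positive_pct": percent_above(normal_q, cutoff),
    }


# Hypothetical weighted Q scores, for illustration only.
delinquent_q = [55, 41, 30, 62, 18, 47, 33, 25, 71, 38]
normal_q = [12, 31, 20, 8, 27, 35, 15, 22, 19, 10]

rates = identification_rates(delinquent_q, normal_q, cutoff=29)
# rates -> {"true_positive_pct": 80.0, "false_positive_pct": 20.0}
```

Running the same function with an unweighted score list and a cutoff of 16 would give the corresponding comparison for the nonweighted system.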

If further studies of the Maze Extension 
bear out our findings, it would appear that the 
use of the Extension for detection or diagno- 
sis of delinquency would depend on the rela- 
tive importance of a high true-positive rate 
vs. a high false-positive rate.

In support of the nonweighted scoring sys- 
tem, Docter and Winder determined that only 
four of the eight error scores, when applied 
singly, differentiated delinquents from non- 
delinquents. In our study, only one individual 


error score, that of the incidence of “Wavy
Lines,” discriminated significantly between the
two groups. The average delinquent in our
study obtained a score of 4.5 for this error as
compared to an average of only 1.16 for the
controls (probability of chi square less than
.05). Thus, neither study indicates sufficient
discrimination on the basis of individual error
scores to warrant individual use of a single
error score. 

In order to further determine whether in- 
tellectual ability per se had any effect on the 
Q score, the correlation of Porteus Quantita-
tive (a measure of intelligence) and Q scores
was computed. The resultant r’s are given in
Table 4. Three of the four r’s are not signifi-
cant and are lower than those reported by
Porteus (4). Thus the Q score appears to be 


Table 3


Percentages of Delinquents and Normals Above Cutoff
Scores Determined by Weighted and
Unweighted Scoring Systems


                               Delinquents            Normals
Study                        Cutoff   % Above     Cutoff   % Above
Weighted
  Porteus (4)                  29       80          29       2
  Wright (6)               [illegible] [illegible]  18    [illegible]
  Docter & Winder (1)          29       70          29       8
  Present (Revised)            29       76          29       28
  Present (Extension)          29       74          29    [illegible]
Unweighted
  Docter & Winder (1)          16    not given      16    not given
  Present (Revised)            16       74          16       28
  Present (Extension)          16       834         16    [illegible]



Table 4


Relationship of Quantitative Score to Q Score


                          Delinquents          Normals
Study                      r        p         r        p
Porteus (4)              -.44     .057
Present (Revised)        -.03     ns        -.004     ns
Present (Extension)      -.03     ns        -.39      .01


generally independent of intellectual ability, 
and use of the Q score as an independent 
measure is supported.

Further evidence on the reliability of scor-
ing was obtained. A correlation of .98 was ob-
tained for independent scorings of 50 of the
completed tests by the authors. The same cor-
relation has also been reported by Docter and
Winder (1).


Summary and Conclusions 


1. Groups of 50 delinquents and 50 non- 
delinquents were given the Revised Porteus 
Maze Test and Extension. Qualitative (Q) 
scores on both tests significantly differenti- 
ated between delinquents and nondelinquents 
(p < .001).

2. No sex differences were found on the Q
score, indicating that previous results may be 
generalized to females. 

3. Results of the present study support the 
hypothesis that no significant relationship 
exists between intelligence, as estimated from 
the Porteus Quantitative score, and Q score. 

4. Evidence is reported which suggests that 
a nonweighted scoring system is nearly as 
efficient as the present weighted system of 
scoring. 

5. Interscorer reliability was found to be 
satisfactory for the Q score (r = .98). 


Received October 29, 1956. 
Manifest Anxiety and Projective and Objective
Measures of Need Achievement 


A. W. Bendig


University of Pittsburgh 


Two of the most popular psychometric 
measures of drives and/or motives are Tay- 
lor’s Manifest Anxiety Scale (4) and McClel- 
land’s projective need Achievement scale (2). 
Raphelson (3) has reported a nonsignificant 
correlation of -.25 between these scales for
a small sample (N = 25) of college Ss. No
reports have appeared to date on the relation- 
ship between McClelland’s projective measure 
of “need Achievement” and the objective scale 
purportedly measuring “need Achievement” 
which is included in Edwards’ Personal Pref- 
erence Schedule (1). 

The above three scales were administered 
to 244 students (136 men and 108 women) 
enrolled in a course in introductory psychol- 
ogy. The raw scores from the scales were 
converted to stanines on the basis of norms 
developed from other groups of Ss and the 
product-moment correlations among the scales 
computed for men and women Ss separately. 
The differences between the correlations from 
the two sex groups were found to be statisti- 
cally not significant. Consequently, the cor- 
relations were averaged by the usual r-to-z 
method. 
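
The “usual r-to-z method” presumably refers to Fisher’s transformation: each r is converted to z = arctanh(r), the z’s are averaged, and the mean is converted back to r. The Python sketch below also includes the customary normal-deviate test for the difference between two independent correlations. The function names are hypothetical, and weighting each z by N - 3 is an assumption; the report does not say whether a weighted or a simple average was used.

```python
import math


def average_correlations(rs, ns):
    """Average correlations through Fisher's z, weighting each z by n - 3."""
    zs = [math.atanh(r) for r in rs]
    weights = [n - 3 for n in ns]
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    return math.tanh(z_bar)


def difference_z(r1, n1, r2, n2):
    """Normal deviate for the difference between two independent r's."""
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (math.atanh(r1) - math.atanh(r2)) / standard_error


# Hypothetical r's for the 136 men and 108 women, for illustration only.
pooled_r = average_correlations([0.10, 0.02], [136, 108])
deviate = difference_z(0.10, 136, 0.02, 108)
```

An absolute deviate below about 1.96 would indicate that the two sex groups do not differ significantly at the .05 level, which is the condition under which pooling the correlations is reasonable.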

The correlation between the MAS and Mc- 
Clelland’s n-Ach scale was .06 while the cor- 


1 An extended report of this study may be ob-
tained without charge from A. W. Bendig, Dept. of 
Psychology, University of Pittsburgh, Pittsburgh 13, 
Pa., or for a fee from the American Documentation 
Institute. Order Document No. 5276, remitting $1.25 
for microfilm or $1.25 for photocopies.

2 The author wishes to express his appreciation to 
Messrs. Jack Dunsing and Oakley Ray for their 
administration and scoring of the projective need 
achievement test. 
relation of the MAS and Edwards’ n-Ach
scale was -.05. Neither of these two correla-
tions approaches statistical significance (N =
244). The correlation between the projective
and objective measures of n-Ach was
which is not significant at the .05 level, but
is barely significant at the .10 level of con-
fidence.

The correlations of the three major vari- 
ables with a measure of verbal ability were 
also calculated. All three correlations were 
nonsignificant: -.01, .02, and .03.

The results indicate that the three scales 
are measuring quite independent traits. The 
small correlation between the two measures 
of “need Achievement” suggests either that
each scale measures a different type of “need
Achievement,” or that the low stability reli-
ability of McClelland’s scale results in con-
siderable attenuation of the relationship.


Brief Report.
Received April 29, 1957. 
The Effectiveness of the Bender-Gestalt
in Differential Diagnosis ¹


Arthur S. Tamkin 


VA Hospital, Northampton, Massachusetts 


Since the introduction of the Visual Motor 
Gestalt test by Lauretta Bender (2) in 1938, 
there has arisen considerable interest in its 
use as a differentially diagnostic instrument 
for psychiatric disorders. While its efficacy in 
identifying cases of organic brain disease, in 
which there is a disintegration of visual motor 
functions, has been considered to rest on firm 
ground, the question of its applicability to 
the functional mental disorders has not been 
settled. Bender found deviations of visual mo- 
tor Gestalt patterns in her studies of schizo- 
phrenic children and adults, but because the 
personality disturbances of psychoneurotics 
seldom invade their visual motor sphere, she
did not find their records to show deviations.
Hutt (6), however, was able to delineate 
characteristic distortions in the Bender draw- 
ings of schizophrenics and psychoneurotics 
which distinguished these clinical groups from 
each other and from patients with organic 
brain damage. Billingslea (3) found no sup-
port for Hutt’s proposed psychoneurotic signs,
nor was Hanvick (5) able to differentiate be- 
tween psychoneurotics with functional back- 
ache and control patients with proven organic 
disease of the back. 

With the introduction of an objective scor- 
ing method by Pascal and Suttell (8) suc- 
cessful separation between clinical groups with 
functional diagnoses has been reported. In ad- 
dition to Pascal and Suttell, Lonstein (7) and 
Bowland and Deabler (4), all using the same 


1 From the Veterans Administration, Northampton, 
Massachusetts. The author wishes to thank the fol- 
lowing members of the Clinical Psychology Service 
for their constructive review of the manuscript: Drs. 
Isidor Scherer, Arnold Trehub, Cesareo D. Peña, and 
C. James Klett. 


scoring method, reported discrimination be-
tween hospitalized psychotics and nonpsy-
chotic psychiatric patients at high levels of 
statistical significance. Because of Pascal and 
Suttell’s findings that age did not materially 
affect scoring levels within the age range of 
15 to 50 years, controls for age were ignored 
in these studies. Furthermore, the factors of 
education and chronicity were not uniformly 
controlled by these experimenters. The pres- 
ent study was an attempt to cross validate the 
findings that the Pascal and Suttell scoring 
method of the Bender-Gestalt differentiates 
significantly between functionally psychotic
patients and those with nonpsychotic, func- 
tional mental disorders. 


Procedure 


The subjects (Ss) used in this study con-
sisted of a group of 27 psychotics and a group
of 27 neurotics and personality disorders
matched on the basis of age. They were all
male patients at the Veterans Administration
Hospital, Northampton, Massachusetts, who
had taken the Bender-Gestalt in conjunction
with other psychological tests for routine psy-
chodiagnostic evaluations. Except for a few,
they had been tested shortly after their ar-
rival as new admissions or readmissions, and
their diagnoses, representing functional psy-
chiatric disorders, were established later by
neuropsychiatric staff conferences. All Ss had
sufficient education to permit the computation
of Pascal and Suttell’s z score; that is, at
least one year of high school. The psychotic
group ranged in age from 20 to 42 years, with
a mean age of 30.85, and the nonpsychotic
group ranged in age from 21 to 43 years, with
a mean age of 31.63. Thus, the Ss were rep-
resentative of the high school and college edu-
cated hospitalized patients with functional 
mental disorders of recent exacerbation who 
might require differential diagnoses. 

Each S’s Bender-Gestalt protocol was scored
by the Pascal and Suttell method without the
scorer’s knowledge of the diagnosis, and the
corresponding z score for the S’s educational
level was determined. For 37 Ss of these sam-
ples, scores on the F and Critical Item scales
of the MMPI were obtained. These two scales
have been shown to be related to degree of
psychopathology when applied to similar pa-
tients (9), and they were used as indices of
psychopathology.


Results 


The correlation coefficient between age and
z score was found to be +.29, which is sig-
nificant at the .05 level. The mean z score of
the group of psychotics was 59.19, and of the
group of neurotics and personality disorders
was 61.59. A t test of the difference between
the means yielded a value of 0.58, indicating
no significant difference. Since the weights of
the reproductive errors derived by Pascal and
Suttell may not have been applicable to the
samples used in this study, the numbers of
raw errors produced by each group were com-
pared. The mean number of raw errors of the
psychotic group was 7.44, and for the group
of neurotics and personality disorders it was
9.44. A t of 1.74, however, was not signifi-
cant at the .05 level. Since MMPI profiles of
37 Ss were available, an attempt was made to
determine if z scores were correlated with test
measures of psychopathology, if not with psy-
chiatric diagnosis. Scores of both F and Criti-
cal Item scales seemed to be suitable criteria
of degree of psychopathology since they each
differentiated the two clinical groups at the
.05 level, based upon one-tailed hypotheses.
Accordingly, correlation coefficients were com-
puted for z and F and for z and the Critical
Item scale. The obtained values of +.29 and
+.17 were not significant.
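
The same coefficient, +.29, is reported as significant for age and z score but not for z and the F scale. The difference lies in the number of cases, as a short Python sketch of the standard t test for a product-moment r shows. It is an illustration under assumptions the report does not confirm: that the age correlation rests on all 54 Ss, that the MMPI correlations rest on the 37 Ss with profiles, and that a two-tailed test was applied. The critical values quoted in the comments are approximate figures from standard t tables.

```python
import math


def t_for_r(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


t_age = t_for_r(0.29, 54)       # age and z score, assumed N = 54
t_f_scale = t_for_r(0.29, 37)   # z score and F scale, assumed N = 37
t_critical = t_for_r(0.17, 37)  # z score and Critical Item scale

# Approximate two-tailed .05 critical values of t:
# about 2.01 for 52 df and about 2.03 for 35 df.
print(round(t_age, 2), round(t_f_scale, 2), round(t_critical, 2))
```

On these assumptions the age correlation yields a t of about 2.19, which exceeds the critical value, while the same r based on 37 cases yields about 1.79, which does not.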


Discussion


The failure to find significant differences be-
tween the Bender-Gestalt scores of these two
clinical groups, when the important extrane-
ous variables of age, education, and chronicity
were controlled, contrasts sharply with the 
positive findings reported by other investi- 
gators using the Pascal and Suttell scoring 
method. In comparing the scoring levels of 
this psychotic group, one finds lower raw 
scores and z scores than those reported by 
the other investigators, while for the nonpsy- 
chotic group the scores are more nearly simi- 
lar (1, 4, 7, 8). It may be of some significance 
in explaining the divergent findings of this 
study that none except Addington controlled 
for the age variable, nor did they uniformly 
control for education or chronicity. Adding- 
ton, who selected subjects with at least two 
years of hospitalization, obtained the highest 
raw score in his schizophrenic group. 

In conclusion, it appears that the Bender- 
Gestalt, scored by the Pascal and Suttell 
method, is of dubious effectiveness in differ- 
entiating between functional psychiatric dis- 
orders. In this study, it failed to separate hos- 
pitalized psychiatric patients with functional 
psychoses from hospitalized neurotics and per- 
sonality disorders, nor did it correlate signifi- 
cantly with MMPI-derived indices of psy- 
chopathology. 


Summary


The effectiveness of the Bender-Gestalt,
scored by the Pascal and Suttell method, in
differentiating the functional mental disor-
ders, was investigated. The z scores were com-
puted from the Bender-Gestalt protocols of a
group of 27 functional psychotics and a group
of 27 neurotics and personality disorders
matched on the basis of age. All Ss were se-
lected from newly admitted or readmitted
hospital patients who had at least ninth grade
education. The findings showed no significant
differences between the two clinical groups
and no significant correlations between z
scores and two MMPI-derived indices of
psychopathology. A significant correlation be-
tween age and z score was obtained, contrary
to the findings of Pascal and Suttell. It was
concluded that the Bender-Gestalt, scored by
the Pascal and Suttell method, has dubious
effectiveness as a differentially diagnostic in-
strument for the functional mental disorders.
Rorschach Animal Responses and Intelligence


Robert Sommer 
Southeast Louisiana Hospital 


Some writers have alluded to a negative 
relationship between the number of animal 
responses on the Rorschach and intelligence 
(1), but the evidence is equivocal as is the 
case with animal movement responses (3). 
The present paper summarizes the results 
from a comparison of Rorschach A, A%, and 
FM with Wechsler-Bellevue verbal scale score 
for 123 psychiatric patients at Southeast 
Louisiana Hospital, the total number of cases 
in the Psychology Department files for whom 
both records were available. The population 
consisted of 77 males and 46 females from all 
diagnostic categories. The Wechsler-Bellevue
verbal scale was used as the IQ criterion, as
it was felt that this would be less affected by 
conditions of depression, organicity, or se- 
nility than would total scale score. 

The Pearson coefficient between A and IQ 
is .27 ± .08. However, it should be noted that
previous research has found a relationship be- 
tween A and R (4) and between R and IQ 
(2). In the present study, these coefficients 
were .87 and .29, respectively. Hence, it 
seemed advisable to run a partial correlation 
between A and IQ, holding the effects of R 
constant. When this was done, the correlation 
between A and IQ was found to be .04. 

The other index of A responses is the A% 
which automatically takes R into account but, 
like most ratios, tends to be somewhat un- 
stable. The correlation between A% and IQ 
is .02. 


The correlation between FM and IQ proved
to be .34 ± .08. When the correlations be-
tween FM and R (.68), and between R and
IQ (.29) were taken into account, the partial
correlation between FM and IQ was found to
be .20 ± .09 which is significant at the .05
level and supports the results found by
Tucker.
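
The partial correlations reported above can be checked from the three zero-order coefficients in each case, using the standard first-order formula r(xy.z) = (r(xy) - r(xz) r(yz)) / sqrt((1 - r(xz)^2)(1 - r(yz)^2)). The short Python sketch below applies it to the published values; the function and variable names are illustrative only.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# A and IQ with R held constant: r(A, IQ) = .27, r(A, R) = .87, r(R, IQ) = .29
a_iq = partial_r(0.27, 0.87, 0.29)

# FM and IQ with R held constant: r(FM, IQ) = .34, r(FM, R) = .68, r(R, IQ) = .29
fm_iq = partial_r(0.34, 0.68, 0.29)

print(round(a_iq, 2), round(fm_iq, 2))  # 0.04 0.2
```

Both results agree with the reported values of .04 and .20. The A coefficient shrinks almost to zero because A is so highly correlated with R (.87) that nearly all of its relationship to IQ is carried by the number of responses.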


Summary 


When the number of responses given by the
subject was taken into account, there was no
over-all relationship between the number of
animal responses and Wechsler-Bellevue ver-
bal IQ for a psychiatric population. However,
there was a small but statistically significant
positive relationship between animal move-
ment responses and IQ.
New Tests


Crawford, John E., & Dorothea M. Crawford Small 
Parts Dexterity Test. Manual (rev. 1956), pp. 12.
New York: Psychological Corp., 1956. 

The Small Parts Dexterity Test is a performance 
test of fine eye-hand coordination, used for person-
nel selection and guidance. It involves tweezer dex-
terity with small pins and collars, and starting and 
tightening small screws with a screwdriver. The re- 
vised manual gives percentile norms based on some 
thousands of employees, applicants, and students. 
There are data on reliabilities and on the relation- 
ships of the test to industrial criteria—L. F. S. 


Gough, Harrison G. California Psychological Inven-
tory. High school-college-adult. 1 form. Untimed
(45-60) min. Test booklet ($6.25 per 25, $21.75 per
100); hand-scoring or IBM answer sheet, and pro-
file ($3.75 per 50, $16.50 per 250); hand-scoring or
IBM stencils ($3.00 per set); sample set ($1.00);
manual, pp. 40 ($3.00). Palo Alto, Calif.: Consult-
ing Psychologists Press, 1956, 1957.


The California Psychological Inventory (CPI) bears 
considerable resemblance to the group MMPI, from 
which about 200 of its 468 items were adapted. But 
the purpose of the CPI is quite different. It is in- 
tended primarily for use with normal subjects, not 
patients, and strives to assess personality character-
istics important for social living. The 18 scales now 
available are divided into four groups: I. Measures 
of poise, ascendancy, and self-assurance (Do, domi- 
nance; Cs, capacity for status; Sy, sociability; Sp, 
social presence; Sa, self-acceptance; Wb, sense of well-
being). II. Measures of socialization, maturity, and
responsibility (Re, responsibility; So, socialization;
Sc, self-control; To, tolerance; Gi, good impression; 
Cm, communality). III. Measures of achievement po- 
tential and intellectual efficiency (Ac, achievement 
via conformance; Ai, achievement via independence; 
Ie, intellectual efficiency). IV. Measures of intellec- 
tual and interest modes (Py, psychological-minded-
ness; Fx, flexibility; Fe, femininity).

All but four of the scales were developed by item 
analyses against external criteria. So, for example, 
discriminates between delinquents and nondelinquents, 
and also has relationships with adjectival descriptions 
of normal persons. Four scales, Sp, Sa, Sc, and Fx, 
were formulated by content and refined by item 
analyses for internal consistency. The manual con- 
tains a wealth of information on the validities of 
scales and on the interpretations of single scales, in- 
teractions, and profiles. The matrices of scale inter- 
correlations, based on 4,098 males and 5,083 females, 
show relationships which vary from zero to the 
seventies. Retest reliabilities after one to three weeks 
cluster around .80; after a year, from .48 to .75. The 
problem of dissimulation is discussed extensively, and 
three scales, Gi, Wb, and Cm are shown to have spe- 
cial value in assessing it. Standard score norms are 
based on over 6,000 cases for males and 7,000 for fe- 
males; some 50,000 cases have already contributed to 
research on the instrument. The manual gives much 
other research data, and cites 54 references.

By both objective and subjective evaluation, the 
CPI appears to be a major achievement. It will surely 
receive wide use for research and for practical appli- 
cations —L. F. S. 
The Scaling of Terms Used to Describe Personality ¹


Arnold H. Buss 


Carter Memorial Hospital, Indianapolis 


and Herbert Gerjuoy 


Indiana University 


The terms used to describe personality have 
diverse origins. The majority of such words 
have been taken over from everyday speech 
and from literature. The various personality 
theories constitute a second source, and such 
terms as “castration anxiety,” “rich inner 
life,” and “approach-avoidance conflict” are 
associated with very different orientations to
personality. Finally, our techniques of person- 
ality assessment have added another kind of 
jargon to the language: “coarcted,” “plus- 
getting,” etc. 

With three major sources of descriptive 
clinical language, it is not surprising that sev- 
eral different terms often have the same be- 
havioral referent. Even more confusing is the 
fact that one term may have several different 
behavioral referents. This chaotic state of af- 
fairs has led to considerable trouble when 
psychologists attempt to communicate their 
evaluations of personality to each other. The 
lack of communication is illustrated by a 
study that involved an attempt to predict fly- 
ing success by analyzing test protocols. In 
discussing the predictions the authors wrote: 


Because at least three different psychologists evalu- 
ated each cadet, the variation in personality descrip- 
tions from one clinician to the next in the kind and 
degree of personality descriptions makes a thorough, 
systematic analysis of this kind impossible. Even
when those cases on which there is unanimous agree-
ment as to predicted outcome are considered, there
is little congruence of terminology, each clinician fa-
voring concepts related to his own frame of refer-
ence. A major problem in further analysis of data of
this kind is finding an adequate system for classify-
ing descriptive terms to make comparisons among
clinicians (5, p. 489).



1 The writers wish to thank Dr. Irma Gerjuoy,
Richard Mathias, and F. L. Mills for their help in 
collecting and analyzing the data of this study. 


If such a classification system were to be 
developed, which words would be included? It 
seems plausible to select those words that have 
the same meaning for psychologists of various 
orientations and eliminate those words that do 
not. If psychologists had available a set of 
terms whose meaning had been agreed upon, 
a large step would have been taken in the de- 
velopment of a scientific personality termi- 
nology. 

One method of assessing interpsychologist 
agreement would be to compile definitions of 
often-used clinical terms. This approach was 
adopted by Grayson and Tolman (3), who 
collected definitions of 50 words that appeared 
frequently in clinical reports. They found 
moderate agreement on some of the terms, 
but in summarizing their findings they con- 
cluded, “The most striking finding of the 
study is the looseness and ambiguity of the 
definitions of many of the terms” (3, p. 229).
This research would seem to confirm the prescientific
state of current personality language.

However, the Grayson-Tolman study may 
not be as damning of present-day terminology 
as it first appears. Most investigators of hu- 
man behavior have found wide disparities be- 
tween a subject’s behavior and his report of 
his behavior in the laboratory. Clinical psy- 
chologists might show considerable agreement 
if they were given a denotative or choice type 
of task rather than a literary type of task. 
For example, if the task were to evaluate the 
intensity of various words along some dimen- 
sion, psychologists might differ very little 
among themselves. A physical analogy may
serve to illustrate this point. Suppose several 
individuals are asked to define and to rank 
the words “hot,” “warm,” “tepid,” “cool,” 
and “frigid” along a dimension of heat. Their
definitions would probably be different, but 
their rankings would probably be quite simi- 
lar. Unfortunately, in the Grayson and Tol- 
man study the subjects only defined terms 
and did not order them along dimensions. It 
seems likely that such ordering would have 
yielded considerably more agreement than did 
defining the terms. Furthermore, in most clini- 
cal situations the psychologist’s denotative be- 
havior (identifying a patient as “agitated”) 
is more important than his literary skill (de- 
fining the term “agitated”). Therefore, a scal- 
ing procedure was chosen in an attempt to 
discover which personality words would be 
reliably scaled by clinical psychologists.

Although the choice of personality dimen- 
sions had to be arbitrary, it was felt that the 
personality variables chosen would be mean- 
ingful to most psychologists who study per- 
sonality. Factors influencing the choice were: 
(a) previous classification systems, e.g., Mur- 
ray (7), Cattell (1); (b) importance to clini- 
cal psychologists as indicated by their reports 
on patients; (c) minimum overlap among di- 
mensions; and (d) availability of words that 
would cover the full spectrum of a dimension. 
A full discussion of the issues involved in clas- 
sifying behavior and selecting response units 
is beyond the scope of this paper. The scaling 
of terms does not stand or fall on the particu- 
lar dimensions of personality chosen. How- 
ever, in defense of the dimensions used here, 
it should be noted that they differ little from 
those used in most classification schemata in 
the area of personality. Furthermore, “feed- 
back” from clinical colleagues who were pre- 
sented with these dimensions revealed ready 
acceptance of them. 


Method 


The initial step involved selecting 18 di- 
mensions of personality along which to order 
descriptive terms: Verbal Hostility, Physical 
Hostility, Hostile Attitudes, Physical Sexual- 
ity, Sexual Attitudes, Anxiety, Mood, Guilt, 
Self-Esteem, Ideation, Impulsiveness, Guard- 
edness, Flexibility, Emotional Warmth, So- 
ciability, Autonomy, Dominance, and Ambi- 
tion. Next, 18 sets of adjectives were compiled, 
one set for each of the personality dimensions. 
Terms were selected which, in the opinion of
the writers, are well known to most psycholo- 
gists, and an attempt was made to avoid the 
pitfalls of both slang and pedantry. Words 
were chosen that had specific rather than gen- 
eral referents, e.g., “solemn,” “brooding,” and 
“hopeless” were selected for the Mood dimen- 
sion rather than “depressed,” which is gen- 
eral enough to include all of the other three 
adjectives. 

Each dimension was composed of a series of 
words ranging from too much of the person- 
ality attribute to too little. For example, the 
dimension of Self-Esteem ranged from “self- 
exalting” through “confident” to “self-depre- 
catory.” 

It was a simple matter to assign words to
most of the dimensions, but in constituting
several dimensions the tendency of some adjectives
to connote more than one aspect of
behavior proved to be troublesome. An example
of multiple connotations in physical
dimensions will illustrate the difficulty: the
word “light” refers to both the dimension of
illumination and the dimension of weight.
Similarly, the group of words “panicky,” “terrified,”
“agitated,” and “tremulous” fall at
one end of the dimension of Anxiety, whose
other extreme may be described by “stolid”
and “phlegmatic” on the one hand, or “overconfident,”
“sure,” and “secure” on the other
hand. The writers have found it useful to
refer to this case as a Y-shaped dimension.
Such Y-shaped dimensions occurred in the
areas of Physical Hostility, Sexuality, Ambition,
Anxiety, and Flexibility. In these instances,
including both possible lists of words
would have necessitated repeating part of one
list. Therefore, only one list was chosen for
each of these five dimensions, the list that
seemed to cover a wider range of behaviors of
interest to clinical psychologists.²


Two scalings of these adjectives were performed.


2 The writers’ opinion was not the sole determinant
of inclusion of words or composition of the dimensions.
The results of a pilot study were also considered.
In this study, 12 psychologists rated words on
16 of the 18 final dimensions. Modifications were
made where there was poor interjudge agreement, and
suggestions by the judges were taken into account
when compiling the final list of adjectives.




Scaling of Terms Used to Describe Personality 


Intensity Scaling 


There were 42 judges³ who were clinical
psychologists with at least an MA or equivalent
and one year of experience. Half the
judges were recruited locally, and the other
half were distributed throughout the country.
The group manifested a good deal of diversity
in training, theoretical orientation, interests,
and age. This heterogeneity suggests that systematic
bias in rank ordering the adjectives
was minimal.

The judges were given the following in- 
structions: 


Enclosed are 18 packets of adjectives, each packet
being a dimension of personality. The top card names
the dimension, and the adjectives are below it. The
idea is to rank the adjectives according to where you
think they belong on the continuum. The continua
are set up so that abnormality is at both extremes,
with normality somewhere in the middle. This differs
from the usual continuum of good to poor adjustment.
For example, on the Autonomy continuum the
word Assured should be somewhere in the middle,
Defiant somewhere near the top (too much Autonomy),
and Clinging somewhere near the bottom
(too little Autonomy). Thus, each dimension ranges
from too much of the variable through a moderate
degree to too little of it. Many of the words have
connotations that have relevance for several dimensions;
try to rank them with respect to only the
other words on the dimension.


The rankings were converted into scale
values by means of Hull’s method (4), which
assumes a normal distribution underlying the
obtained ranks. Each ranking was converted
into a percent position, which was translated
into a scale value along the base line of a normal
distribution curve. Then the scale values
of all the judges were averaged to yield a
mean scale value for each word. The standard
deviation of the normalized rankings of
each word was also computed.
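
The conversion of ranks to normalized scale values can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reproduction of Hull's tables: it assumes that the percent position of rank R among N words is (R - 0.5)/N, and that scale values lie on a base line centered at 5 with one unit equal to half a standard deviation. Those constants and the names `rank_to_scale` and `scale_words` are introduced here for illustration.

```python
from statistics import NormalDist, mean, stdev


def rank_to_scale(rank, n_items, center=5.0, units_per_sd=2.0):
    """Convert one rank (1 = first) to a normalized scale value.

    Assumption: percent position = (rank - 0.5) / n_items, mapped through
    the inverse normal curve; center and units_per_sd are illustrative.
    """
    percent_position = (rank - 0.5) / n_items
    z = NormalDist().inv_cdf(percent_position)
    return center + units_per_sd * z


def scale_words(rankings):
    """Average the normalized values over judges.

    rankings: list of dicts (one per judge) mapping word -> rank.
    Returns word -> (mean scale value, SD of normalized values).
    Needs at least two judges, since a sample SD is computed.
    """
    words = list(rankings[0])
    n_items = len(words)
    result = {}
    for word in words:
        values = [rank_to_scale(judge[word], n_items) for judge in rankings]
        result[word] = (mean(values), stdev(values))
    return result
```

With an odd number of words, the middle rank falls at the center of the base line, and the first and last ranks fall symmetrically on either side of it.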

The Intensity scaling also involved abnormality,
since the judges were instructed that
abnormality corresponded to too much or too
little of the particular variable. However, it
was not known whether the judges followed
the instructions, and the relationship between
intensity and abnormality remained unknown.


3 The writers wish to thank these 42 psychologists,
as well as those who served as judges in the pilot
study and those in the second scaling operation. This
study could not have been carried out without their
considerable efforts.
Therefore, the words were scaled again, this
time for abnormality. 


Abnormality Scaling 


There were 24 judges, all recruited locally, 
and the same minimum qualifications for 
judges were used as in the first scaling. Twelve 
of these judges had also participated in the
first (Intensity) scaling, but seven months 
elapsed between scalings. Again there was con- 
siderable heterogeneity in training and orien- 
tation of judges. 

A rating scale method was used, and judges 
were given the following instructions: 


The purpose of this study is to scale words on the
dimension of abnormality. This dimension may be
thought of in terms of a continuum of numbers that
ranges from Normality at 1 to Abnormality at 9.
Your task is to judge each word with respect to this
dimension of abnormality. Simply assign a number
between 1 (most normal) and 9 (most abnormal)
to each word. The task does not involve ranking the
words but only assigning each one to a point on the
1 to 9 abnormality continuum. There are 18 sets of
words, each set covering an area of personality.
Within each set the words are arranged in alphabetical
order. Use each number (1 to 9) as often as
you need.


Average scale values were obtained for each
adjective directly by computing the arithmetic
mean of the numbers assigned to it by the
various judges.⁴ Both the method of scaling
and the kind of dimension (abnormality at
the opposite extreme from normality) were
different than they were for the Intensity
scaling. It may be argued that the two different
instructions and scaling procedures constitute
a check on the stability of the scale
values, since in scaling for Intensity both
the instructions and the scaling method are
sources of variance. Scale values should transcend
the particular instructions and scaling


4 The raw data were amenable to scaling by the
method of successive intervals (2). This method as-


ing represent equal intervals in the psychological continuum.
With well-trained raters (psychologists) it


procedures that differed as much as possible, to pro-
of the




method used; otherwise, their applicability 
would be extremely limited. A high relation- 
ship between the Intensity scale values and 
the Abnormality scale values, despite the pro- 
cedural differences involved in scaling, would 
suggest that the scale values have some gen- 
erality. 


Scale Values 


The results of the two scalings are presented 
in Table 1. Low Intensity means correspond 
to an insufficiency of the personality variable, 
and high Intensity means correspond to an 
excess. The Abnormality means are directly 
related to abnormality: the larger the mean, 
the greater the degree of abnormality. Within 
each personality dimension the words are 
listed in order of their Intensity means, from 
lowest to highest. Inspection of Table 1 re- 
veals that as the Intensity means increase in 
magnitude, the Abnormality means proceed 
from high values to low values to high values 
again. This relationship is best illustrated by
a plot of the Abnormality means vs. the Intensity
means, and Fig. 1 shows such a plot
for the dimension of Impulsiveness. The two
segments of the curve form a rough V. The
left (descending) segment represents words
with low to moderate Intensity values; here
the Abnormality values descend from the extreme
toward minimal Abnormality. The right
(ascending) segment represents adjectives
with moderate to high Intensity values; here
the Abnormality means ascend from minimal
Abnormality toward the extreme. The apex of
the V corresponds to the word with the minimal
Abnormality score.

Fig. 1. A plot of abnormality means versus intensity
means for the impulsiveness dimension.

In determining the strength of the relationship
between the Intensity and Abnormality
means, it was decided to break the curves at
the apex of the V into two segments, one for
the descending and one for the ascending arm
of the V. Within each segment a rank-order
correlation between the Intensity and Abnormality
means was computed.⁵ Thus, two correlations
were obtained for each personality
dimension, one for each portion of the V-shaped
curve. These two correlations were averaged,
weighting each correlation by the
number of words in its portion of the V-shaped
curve. This average was taken as the
measure of the over-all strength of the relationship
between the Abnormality and the Intensity
means in each dimension.
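
The averaging procedure described above might be sketched as follows. This is a hedged illustration rather than the writers' own computation: it assumes that the apex word is dropped from both segments, that the magnitude of each rank-order correlation is what gets averaged (the descending arm yields a negative coefficient), and that tied values receive average ranks. The function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Rank-order correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined when all values are tied")
    return cov / (vx * vy) ** 0.5


def v_curve_agreement(intensity, abnormality):
    """Weighted average of the two within-arm correlations of a V-shaped plot."""
    pairs = sorted(zip(intensity, abnormality))
    apex = min(range(len(pairs)), key=lambda i: pairs[i][1])
    arms = [pairs[:apex], pairs[apex + 1:]]  # apex word omitted (assumption)
    total, weight = 0.0, 0
    for arm in arms:
        if len(arm) < 2:
            continue  # a correlation needs at least two words
        xs, ys = zip(*arm)
        total += abs(spearman(xs, ys)) * len(arm)
        weight += len(arm)
    if weight == 0:
        raise ValueError("no arm has enough words")
    return total / weight
```

For a perfectly V-shaped set of means, each arm is perfectly ordered and the weighted average is 1.0; departures from a clean V lower the value.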

These average rank-order correlations
ranged from .75 to .97, indicating a strong
relationship between Intensity and Abnormality
scale values for all the dimensions. Evidently,
the judges did consider abnormality
when scaling for Intensity. Thus, the scale
values appear to have considerable generality.

Reliability 


Reliability coefficients of the mean scale
values were computed for each dimension.
All the Intensity reliability coefficients were
above .99, and the Abnormality reliabilities
ranged from .93 to .98.

The SD’s in Table 1 are related to the
reliability coefficients in the following way:
for the Intensity scaling the higher the reliability,
the lower the average variance. The extremely
high reliability coefficients mean that
the obtained SD’s are extremely low.

The Abnormality SD’s shown in Table 1
are slightly larger than the Intensity SD’s.
There are three reasons for this discrepancy.

5 In order to avoid inflating the obtained correlation
coefficients, the pair of mean scale values of
that adjective with minimal Abnormality score on
any given dimension was omitted from the computations.

Table 1
Mean Scale Values and SD’s for the Intensity and Abnormality Scalings
Intensity Abnormality Intensity Abnormality
Mean SD Mean SD Mean SD Mean SD
Verbal Hostility Physical Sexuality 
1. venomous 1.7 .63 7.7 2.08 1. assaultive 1.8 53 84 1.11 
2. abusive 2.3 54 6.9 1.50 2. wanton "28 al 7.6 1.68 
3. threatening 2.6 .92 6.6 1.95 3. over-active 3.6 49 S7 1.65 
4. derisive 34 153 5.8 194 4. soliciting 38 85 62 135 
5, derogatory 3.5 .51 5.5 1.98 5. passionate 3.9 Erd 2.8 1.68 
6. scornful 3.7 59 55 171 6. assertive 43 156 3.2 1.59 
7. sarcastic 4.1 43 5.4 1.61 7. amorous 4.6 42 2.6 1.70 
8. argumentative 4.4 53 48 1.31 8. permissive 5.1 44 29 1.86 
9. over-critical 45 35 50 170 9. hesitant 58 36 43 164 
10. nagging 4.6 31 5.5 1.71 10. passive 6.0 -60 48 1.84 
11. outspoken Si +24 3.1 1.44 11. restrained 6.4 63 4.1 1.76 
12, frank 5.5 20-25 138 12. inhibited 68 41 5.6 1.49 
13. tactful By AS de Ñi | 13. apadene 73 87 6&8 155 
14 soft-spoken 60 38 28 148 14. abstinent 81 62 6.6 200 
ig complimentary 63 24 31 Pes | Seal Attitudes 
. praising 67 37 3.8 1.76 
17. flattering 70 483 5.0 1.44 1. lewd 2.5 1.18 71 2.04 
18. mealy-mouthed 72 1.51 5.9 201 2. lustful 2.5 -70 6.0 2.01 
19. apple-polishing 7.6 66 6.0 1.51 3. erotic 3.6 54 3.6 2.14 
20. eulogistic 8.1 88 5.5 1.85 5 sensual 3.8 43 3.2 Tsi 
Physical Hostility a uaa so | BB t 
il Murderous 17 18 88 64 7, prim 57 34 54 180 
2. assaultive 3.2 37 8.0 1.34 8. prudish 6.3 66 6.0 1.84 
3. destructive 34 52 70 167 9. ashamed 66 81 §9 173 
4. combative 3.9 36 6.6 1.63 10. puritanical 7.0 .87 7.1 1.78 
5. hot-blooded 44 18 5.3 1.67 11. disgusted 78 7A 73 146 
6. even-tempered 5.1.26 1.5 91 i i j à 
7. peaceable 5.7 34 2.1 1.09 | Anxiety 
8. harmless 61 71 4.1 1.79 1. terrified 1.8 54 84 1.63 
9. inhibited 64 72 4.7 1.60 2. panicky 22 54 7.8 1.29 
10. placating 6.9 53 42 1.60 3. agitated 2.9 53 70 147 
11. cringing 8.2 AL 7i 1.61 E tremulous 3.5 AL 69 1.74 
Hostile Attitudes TEn 138 
Lalie 16 60 69 172 7. fretful 43 30 54 135 
2. embittered 2.8 86 58 1.74 8. uneasy 46 26 42 1.75 
3. quarrelsome 2.9 80 58 1.63 9. composed 53) 44 1.5 71 
4. saly 30 80 66 1.80 10. calm 5.5 153 14 49 
5. provocative 3.5 96 59 1.68 11. nonchalant 60 61 3.1 1.59 
6, Cesena] 38 59 5.6 1.82 12. unconcerned 61 45 41 182 
7. irritable ae 0, Ag aai aa 61 94 93 134 
8. grouchy o M e on A bia 67 93 as iss 
9. petulant 43 74 Š6 152 15. stolid 7.2 89 5.2 1.61 
10. grudging 4.3 62 52 1.58 16. imperturbable 7.3 88 3.8 1.58 
AN, eale 5.4 38 24 1.19 17. phlegmatic Mal: 90 5.4 1.89 
12. inoffensive 5.7 53 3.6 1.44 : ó i 
13. unresentful 5.8 -62 36 173 Mood 
14. agreeable 6.0 -68 1.6 -90 1. euphoric 1.3 38 6. 
iE enue 64 64 23 118 2. elated am ff E 
16; eritious 6.5  .76 2.2 1.20 3. frivolous 28 73 51 155 
17. conciliatory 66 65 30 131 4. buoyant 32 60 BY 4.54 
18. ingratiating 73 81 5.0 183 5. gay 34 149 27 121 
19. oily 77 105 65 163 6. jovial 355 AL Z1 Tn 
20. fawning 78 109 6l 163 7. light-hearted 40 31 24 132 
Table 1—Continued


Intensity Abnormality Intensity Abnormality
Mean SD Mean SD Mean SD Mean SD
8. cheerful 4.1 27 14 63 5. musing 4.3 A4 3.0 1.46 
9. placid 4.5 10 3.3 1.95 6. contemplative 4.4 64 a4 Way 
10. sober 49 20 3.4 1.65 7. thoughtful 49 35 1.6 1.04 
11. serious 5.0 25 3.2 1.61 8. matter-of-fact 5.6 33 3.4 1.58 
12. solemn 5.4 24 45 1.89 9. literal 5.9 64 42 113 
13. mirthless 5.5 64 58 1.71 10. unreflective 6.4 56 5.5 1.61 
14. grave aa 2 3 1.93 11. unimaginative 6.9 58 55 1.66 
15. gloomy i . i 1.63 12. stolid 81 
16. brooding 6.7 -70 5.8 1.64 13. vacuous oe a ae ra 
17. dejected 70 52 5.7 1.54 : i i : 
18. disconsolate 7.1 68 6.9 1.55 | Impulsiveness 
19. despondent 7.6 65 ZO; 15% 1. incontinent 21 Toe 78 131 
20. hopeless 8.4 68 7.9 1.26 2. reckless 2.2 83 70 1.78 
Guilt 4 ie 27 62 63 re 
a - us ý F $ i 
1. self-condemning 1.6 49 74 1,55 5. excitable 3.6 3 z 1 1.99 
2. self-reproachful 2.7 .47 EE MART 6. hasty i : i 4.70 
3. remorseful 28 75 42 248 7, ab 32 80 50. Gi 
4. ashamed 33 44 34 18 ae ep! ai s as 1 
5. chagrined 37 S a7 ia 9. mobile a 2 2 i 
6. regretful 39 27 35 200 | 410. spont 7, = © ae 
7. concerned 44 10 20 143 a spontaneous 48 40 18 1L 7 
8. indiff a r - self-possessed 55 64 21 1.6 
S Ea z .52 59: 1.71 12. cool-headed 57 53 20 102 
- ng 6 91 i wa . 
10. unreformed ee | Yee s9 a 3.0 oli 
11. cynical 58 78 Bie ie 14. controlled 60 61 2.8 t 
12. unrepentant 59 58 ES ise A restrained 6.6 36 4.2 L 
13. hardened SR pg OP a oaa 66 a 43 15i 
14. shameless 6.5 69 67 146 is: oye cautious 7.3 64 5.5 139 
15. conscienceless 70 95 82 88 19. retarded 78 87 6.8 io 
16. unscrupulous AT ~ 85 72 169 - sluggish 78 79 58 1. 
17. incorrigible IT 85 8.0 1,58 | Guardedness 
Self-Esteem 1. uncommunicative 1.9 57 65 218 
r 2. secretive 2.1 71 6.5 1.35 
2, on 26 86 ve et 3. taciturn 34 62 49 20 
3. conceited 34 57 54 177 S lose 36 61 44 1i 
. icent 3.8 58 4.6 ‘ 
4. boastful 3.4 63 49 1.71 6 x 1.73 
© vain 3.5 58 58 186 z r ar o mieta 40 66 Be 131 
E z “ + reserve 4.3 48 KS . 
6. cocky 3.6 85 4.7 186 8. di 1.06 
7. confident 46 12 13.69 s Sach 46 45 20 ie 
; . ghtforward 5.6 .62 2.3 4 
8. self-respecting 4.7 13 1.3 69 10. candid 59 77 2.7 1.9% 
9. modest 53 22 24 1.44 11. : o TA 
10. unassuming 53 23 24. 4199) 12; alia AY a AA 2.10 
11. humble 59 32 40 1.79 13. talkative 64 46 38 16 
12. self-doubting 64 45 5.9 1.55 14. voluble 73 93 57 21 
13. self-effacing 6.8 66 62 171 15. verbose 73 77 5.7 185 
14. self-deprecatory 7.4 71 73 174 16. garrulous 76 1.00 54 t 
15. forlorn “7.5 96 69 1.74 Flexibilj 
16, self-abasing 77° M Ta 1.80 a uty 115 
7 1. spineless 18 aio 73 {M 
Ideation 2. over-suggestible 2.6 253: 6.3 62 
1. delusional 1.8 oA ee is : Bee 35 54 43 1 
2. ruminative ES M E ` . docile 26 93 5: y 
3. day-dreaming 3.4 52 48 1.71 5. changeable 37 153 3.5 i 
4. fanciful 3.6 Si 45 191 6. amenable 44 139 2.0 
Table 1—Continued


Intensity Abnormality Intensity Abnormality 
Mean SD Mean SD Mean SD Mean SD 
7. adaptable 45 44 12 es? 4. contrary 36 353 5.5 1.78 
8. conventional 4.9 31 40 1.37 5. resistant 3.7 .60 4.5 1.47 
©; ‘Hersistent 55 37 40 171 6. wilful 37 63 4.7 175 
10.: habit-bound 5.9 -60 6.0 131 7. dissenting e S -60 46 1.71 
11. stubborn 6.3 57 5.1 1.86 8. self-reliant 4.6 sot 15), 21:00 
12. perseverating 67 111 7.0 1.54 9. assured 46 57 15 1.08 
13. unbending 67 63 6.0 1.59 10. resigned SS GT. Sa 212 
14. mulish 71 101 71 14 11. weak 59273 61 1.44 
15. intractible 79.78 72 141 12. quitting 60 110 61 183 
Enea arn fo 
1. over-indulgent BO ets 5.8 1.58 15. pleading 6.7 93 5.9 1.86 
2 doting 2a A res a 16. whining 6.9 67 7.0 1.81 
. affectionate 5 : : g 17. clingi f J 
ae: io} a 2 ae Sis, a6 7. clinging 74 1.00 64 1.80 
: sei amp oe Hs a1, 166 18. helpless 7.8 114 7.8 ~ 1,74) 
. tender E . 7 : 
6. sympathetic 4.2 58 24 1.15 Dominance 
7. kindly 4.6 36 1.7 90 1. dictatorial 1.6 AL 737 G52 
8. considerate 4.8 Ži He ie 2. autocratic 24 43 67 1.27 
9. cool 5.8 6 L 3. high-handed S atl 
10. unresponsive 5.9 60 6.7 1.77 4. nefera ‘ 34 30 2B sith 
11. detached 6.0 8 3 Be fae 5. forceful 37 37 44 TE 
K angealine oe 5, E 6. assertive 40 42 25 1.38 
- harden - -l X z 7. decisive 4.2 .24 24 1 
14. rejecting Vie 405° #2. 129 i Se 
is. foad 78 79 79 112 8. cooperative 4.8 32 18 1.09 
. Ingu s i; : z 9. assenting 5.3 42 41 1.53 
Sociability 10. conforming 5.5 52 4.2 1.77 
if intrusive 20 .86 55 alii 11. compliant 59, G55) 44 1.49 
3. meddlesome 2.1 97 6.3 1.82 12. acquiescent 5.9 65 EI ATT, 
3, gregarious 2.9 67 21 1.17 13. imitative 6.2 82 5.4 1.53 
A convivial 33 58 1.9 95 14. deferent 6.4 .76 48 1.61 
ee hae 34 WS 29 141 15. timid 69 Al 5.6 1.84 
Si 
radel: 38 37 i7 &85 16. meek 73° (59 5.6 1.87 
6. com: y 17. l 
7. companionable 4.0  .38 15 64 + servile 85 34 7.5 1,53 
8. agreeable 44 25 17 834 | Ambition 
9. accessible 48.72 23 1.38 i 
IOs Ra -52 ‘29 42 147 í: grandiose LS -00 7.5% “HSS 
tt ceca 53 ‘47 Be 44-66 2i pretentious 2.7 .58 595 1,72 
ron Eer 56 32 4.3 1,70 3. aspiring 3.5 1 1.8 76 
aS feet 56 39 52 1.75 4, enterprising 3.7 47 18 1.26 
ip hae 59 7 54 1.75 5. persistent 3.8 5 29. 144 
15. shrinking 65 53 70 157 he r 2 eee 
Tege 72 65 Vile WES if content 49 .24 2.5) 71.58) 
Tee 75 "5 83 1.25 8. self-satisfied 4.9 25 35 1.71 
4 ities 75 `S0 75 1.66 i fe 5.3 33 3.9 1.66 
. $ ‘ 73 70 194 s adaisica! 6.0 40 5.9 4 
19. isolated fe 11. indifferent Gt 67h ese en 
Autonomy 12. listless 6.6 74 6.0 1,92 
1. rebellious 1.8 pl 6.5 1.50 13. indolent Pas 96 64 1.87 
2. defiant 21 63 6.3 1,84 14. apathetic Te 1,00) 6.8 1.98 
3. negativistic 3.0 69 6.5 1.76 15. lethargic 16 = 82 6 L71 


First, the Abnormality reliabilities were lower.
Second, the number of judges on which the
scale values were based was smaller. Third,
the Abnormality SD’s were inflated by differences
among judges in average rating, whereas
the Intensity SD’s could not be affected by
this variable because the rank-order method
was used.


Discussion 


The results clearly indicate that clinical 
psychologists can order descriptive terms con- 
sistently. In both the Intensity and Abnormal- 
ity scalings the vast majority of personality 
terms were ordered with relatively low inter- 
judge variability. It may be concluded that 
when the task is denotative and the personal- 
ity dimensions are specified, many words in 
current usage have behavioral referents clear 
enough for judges to agree in scaling them. 

These findings may be extrapolated to the 
written reports of clinical psychologists. The 
clinician’s predilection for using terms dear 
to his theoretical or test orientation must be 
assessed in the light of the communication
value of such terms. Words that are scaled
less consistently should be omitted, and words
with lower interjudge variability should be
included in psychological evaluations. In addition,
the dimension of personality should be
specified when descriptive terms are used. For
example, the word “cool” may refer to a relative
lack of anxiety or to a relative lack of
emotional warmth; but when the dimension 
is specified, the correct connotations are com- 
municated. When the personality dimension is 
not stated, communication often breaks down. 
In addition to suggesting which terms to 
use in evaluations of personality, the present 
classification system may be a useful tool for 
personality research. It might be utilized in 
an assessment study of the Holtzman and Sells 
type (5). Words could be selected on the basis 
of Intensity or Abnormality. Clinicians usually 
prefer to use words rather than numbers in 
characterizing personality. With the present 
classification system they could choose words 
to describe personality, whose scale values for 
Intensity and Abnormality had already been 
determined. The question of reliability among 
clinicians in assessing personality (a thorny 
issue in the Holtzman and Sells study) could 
be settled easily, since all clinicians would 
choose words from the same lists and the dif- 
ferences among clinicians could be determined 
quantitatively. 

An investigator planning to use the Intensity
scale values would probably not need the
entire list of words on any dimension. The
following suggestion may prove helpful in selecting
words. If two words have similar or
identical scale values, select the word with the
smaller Intensity standard deviation. If there
is still doubt, use the Abnormality standard
deviations to decide which word to include.
Similarly, if the investigator is primarily interested
in the Abnormality scale values, he
would use the Abnormality standard deviations
first in breaking ties in Abnormality
mean scale values.
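
The selection rule just given lends itself to a short sketch. The code below is one possible reading, not a procedure taken from the study: it treats two words as having "similar" scale values when their means differ by no more than a hypothetical `tolerance`, and the `Term` record, the function name, and the sample entries are all assumptions made for illustration.

```python
from typing import NamedTuple


class Term(NamedTuple):
    word: str
    intensity_mean: float
    intensity_sd: float
    abnormality_mean: float
    abnormality_sd: float


def select_terms(terms, by="intensity", tolerance=0.1):
    """Keep one word from each run of words with similar mean scale values.

    Ties are broken by the SD of the scaling of interest first, then by
    the SD of the other scaling. The tolerance is an assumed cutoff.
    """
    if by == "intensity":
        mean_of = lambda t: t.intensity_mean
        tie_key = lambda t: (t.intensity_sd, t.abnormality_sd)
    elif by == "abnormality":
        mean_of = lambda t: t.abnormality_mean
        tie_key = lambda t: (t.abnormality_sd, t.intensity_sd)
    else:
        raise ValueError("by must be 'intensity' or 'abnormality'")

    chosen, group = [], []
    for term in sorted(terms, key=mean_of):
        # Start a new group once the mean moves past the tolerance.
        if group and mean_of(term) - mean_of(group[0]) > tolerance:
            chosen.append(min(group, key=tie_key))
            group = []
        group.append(term)
    if group:
        chosen.append(min(group, key=tie_key))
    return chosen


# Hypothetical entries, not values quoted from Table 1.
sample = [
    Term("word_a", 5.3, 0.22, 2.4, 1.44),
    Term("word_b", 5.3, 0.23, 2.4, 1.39),
    Term("word_c", 5.9, 0.32, 4.0, 1.79),
]
```

With these sample entries, selecting by Intensity keeps `word_a` over `word_b` because its Intensity SD is smaller, while selecting by Abnormality keeps `word_b` because its Abnormality SD is smaller.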

The classification system can be used as a
self-rating device. Zuckerman et al. (8) used
the data of the pilot study to select terms
along 16 dimensions of personality. They had
patients and normals describe themselves,
their ideal, their parents, and others, and the
various discrepancies between self and ideal,
self and parents, etc., were investigated. What
distinguished their study from most similar
studies was that they were able to specify in
which areas of personality there were large
discrepancies.

Mills⁶ used the present data to investigate
personality factors in judging the behavior of
others. Both the judges and the individuals
who were judged assessed themselves by
checking one word on each of the 18 personality
dimensions. Both Mills and Zuckerman
et al. selected words on the basis of Intensity
scale values, spacing terms so as to cover the
whole dimension in more or less equal intervals.
The more difficult terms had to be eliminated
so that the vocabulary level did not
exceed that of the noncollege population. In
both investigations, the time taken by subjects
to complete the assessment of self and
others was considerably less than that required
by a Q sort.

The present study has implications that go
beyond the possible applications of the classification
system. The language of clinical psychology
is largely prescientific and imprecise.
In the search for a precise and objective terminology,
it would seem that psychological
scaling methods have much to offer.


Summary


In this study, adjectives were scaled on
18 dimensions of personality, using rank-order
and rating-scale methods. The terms were
scaled for Intensity (too much to too little of
the variable) and for Abnormality (minimal
to maximal abnormality); there were 42 and
24 judges, respectively, all clinical psychologists.
The mean scale values proved to be reliable,
and possible applications of them were
mentioned. It was suggested that psychological
scaling methods are of service in improving
personality terminology.


6 Unpublished study.


Received December 21, 1956. 
What Is Measured by the “Cannot Say” Scale
of the Group MMPI?¹


Arthur S. Tamkin and Isidor W. Scherer 


Veterans Administration Hospital, Northampton, Massachusetts 


While a large fund of information is avail- 
able about the various MMPI clinical and va- 
lidity scales and the personality attributes 
which they purport to measure, relatively 
little has been written about the “Cannot 
Say” scale. The purpose of this study is to 
determine if the “Cannot Say” scale of the 
group form is the reflection of a defensive or 
evasive attitude on the part of the examinee, 
as suggested by Brown (1), and if it is re- 
lated to the symptoms of psychasthenia and 
depression in psychiatric patients, as observed 
by the test authors (2). 

Completed successively administered MMPI 
answer sheets were obtained from 126 male 
psychiatric patients who were newly ad- 
mitted or readmitted for hospital treatment. 
They were scored for the “Cannot Say” scale, 
the three validity scales, and the nine clini- 
cal scales. The frequency distribution of the 
“Cannot Say” scores was described by a dis- 
continuous curve, the major portion of which 
showed marked positive skewness. The scores 
ranged from zero to 145, the mode being zero 
and the median one. The discontinuous por- 
tion of the curve was made up of 12 extreme 
cases whose scores ranged from 26 to 145. 

In order to explore the hypothesis that
“Cannot Say” scores may be related to defensive
attitudes or to the symptoms of psychasthenia
and depression, the relationship between
“Cannot Say” and each of the following
scales was evaluated by chi-square tests:
L, F, Pt, and D. None of the chi squares
reached significance at the .05 level or better


remainder of the MMPI. When the hypothesis
that the 12 extreme cases omitted a disproportionately
larger number of items on particular
psychiatric scales was tested, only one
scale, Ma, reached the .05 level of confidence.
It was concluded that for the group form of
the MMPI there is no relationship between
high “Cannot Say” scores and MMPI measures
of depression and psychasthenia, and that
high “Cannot Say” scores, as a general rule,
do not seem to represent a defensive, evasive
attitude on the part of the psychiatric patient.
This, however, does not rule out the possibility
that certain individual patients may use
omissions as an evasive measure, but merely
makes untenable the general application of
such an hypothesis.


1 An extended report of this study may be obtained
without charge from Arthur S. Tamkin, Veterans Administration
Hospital, Northampton, Mass., or for a
fee from the American Documentation Institute. Order
Document No. 5273, remitting $1.25 for microfilm
or $1.25 for photocopies.




Brief Report. 
Received March 15, 1957. 
Heterosexual Somatic Preference and
Fantasy Dependency 


Alvin Scodel 


The Ohio State University


With the exception of occasional statements 
in the psychoanalytic journals, almost no at- 
tention has been given to the female body- 
types considered desirable by adult males, and 
the psychological correlates of such prefer- 
ences. It is, indeed, rather curious in light of 
the numerous attempts to view all behavior 
and attitudes from Freudian assumptions that 
a topic which so preempts the conversation 
and fantasies of adult males should have been 
so completely ignored for systematic psycho- 
logical investigation. 

When somatic preferences are discussed in 
the psychological literature, either in case 
studies or as part of an exposition of a par- 
ticular theoretical position, inconsistent con- 
clusions are often reached. Tridon (8) asserts 
that men nursed at the breast in infancy are 
attracted in adulthood to women with well 
developed breasts, and that men who have 
been fed by bottle prefer thin, boyish looking 
girls. Breast feeding, presumably a more grati- 
fying experience than bottle feeding, leads to 
a later preference for women with well devel- 
oped breasts who can continue to supply the 
oral gratifications of infancy. On the other 
hand, Gorer (4) states that the fetish that 
exists for breasts in American culture is a 
derivative of scheduled feeding in infancy. In 
this view, then, a preference for large breasted 
females is a result of oral frustration rather 
than oral satisfaction. 

Despite such differences, there is little dis- 
agreement about the importance of the breast 
to the concept of orality. In Freud’s opinion 
(2), the child at the breast formed the proto- 
type of all future love relationships. Illustra-
tive clinical illustrations can be found in cases
described by Gero (3), and Levey (6). Gero’s 
patient had been separated from his mistress, 


and it is stated that “his longing for this 
woman was not a man’s longing for a lost 
love, but a child’s longing for his mother. . . . 
Whenever he thought of his beloved, the aim 
of his yearning, the lost paradise in his im-
agination was always the breast” (3, p. 448).
Levey discussed an ulcer patient who “fan- 
tasies the earth, the air, and the universe as 
composed of breasts, himself floating in a sea 
of breasts, and the Capitol at Washington as 
having a breast pinned on it” (6, p. 163). It 
seems to be clearly implied that the orally 
fixated person is preoccupied with the ample 
well developed breast rather than the small 
breast. By way of a casually stated ipse dixit,
common clinical parlance also speaks of large
breast preference as a product of oral fixation.
The present investigation is concerned with
the relationship of breast size preference to an
allegedly important aspect of orality, namely,
dependency. The phrases “oral passivity” and 
“oral dependence” are so frequent in both the 
psychological and psychoanalytic literature 
that no documentation is necessary. The in- 
strument used to measure dependency in the 
present study is the TAT. To be more pre- 
cise, fantasy rather than overt dependency is 
being measured, although there is evidence to 
indicate that fantasy dependency as measured 
by the TAT correlates significantly with overt
dependency when overt dependency is meas-
ured by conformity to erroneous group judg-
ments (8).


Procedure 


Five male graduate students rated 101 full- 
length pictures of nude females (obtained 
from photography and art magazines) along 
a dimension of breast size. The following rat- 
ing scale was used: 



1. Considerably above average breast size.
2. Slightly above average breast size.
3. Average breast size.
4. Slightly below average breast size.
5. Considerably below average breast size.


By summing the judges’ ratings, a range of 
judgments for all pictures was obtained which 
extended from five to 25. Those pictures with 
a total rating of 11 or less were defined as 
large breasted (LB), those with a total rating 
of 19 or more as small breasted (SB). Sixty- 
five of the pictures met these criteria. Of 
these 65, 32 were classified as LB, and 33 
as SB. Sixty-eight per cent agreement was 
achieved by the judges in their ratings of the 
65 pictures. Ninety per cent agreement was 
obtained by combining Categories 1 and 2 
and Categories 4 and 5. 
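
The classification rule just described can be sketched in a few lines. This is only an illustration: the function name and the sample ratings are hypothetical, and only the scale (1 to 5) and the cutoffs (a sum of 11 or less for LB, 19 or more for SB) come from the text.

```python
def classify_picture(ratings):
    """Classify one picture from five judges' ratings on the 1-5 scale.

    Low ratings mean above-average breast size, so a low sum means LB.
    Returns "LB", "SB", or None when the sum falls between the cutoffs.
    """
    if len(ratings) != 5 or any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("expected five ratings, each from 1 to 5")
    total = sum(ratings)  # possible range: 5 to 25
    if total <= 11:
        return "LB"
    if total >= 19:
        return "SB"
    return None  # picture not used for either category


# Hypothetical ratings for three pictures.
print(classify_picture([1, 2, 2, 3, 3]))  # sum 11
print(classify_picture([4, 4, 4, 4, 3]))  # sum 19
print(classify_picture([3, 3, 3, 3, 3]))  # sum 15
```

Pictures whose summed rating falls from 12 to 18 are left unclassified, which is why only 65 of the 101 pictures met the criteria.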

In order to obtain a higher probability that 
an experimental subject would select between 
two females on the basis of breast size, it was 
necessary to equate the pictures as much as 
possible in general attractiveness. Otherwise
a subject who usually preferred LB females
might on occasion select an SB female be-
cause of her greater general appeal. Similarly,
the pairing of an attractive LB female with
an unattractive SB female could lead a sub-
ject to select the former although he gener- 
ally preferred SB females. To minimize the 
effect of facial attractiveness, masking tape 
was placed over the face on each of the pic- 
tures. Five additional judges then rated each 
picture on a six-point scale ranging from 1 
(exceptionally attractive) to 6 (exceptionally 
unattractive). The range of judgments was 
eight to 23. The purpose of obtaining these 
ratings at this time was to determine whether 
a sufficiently large number of pictures classi- 
fied as LB and SB would fall within a similar
range of attractiveness in order to permit
pairing. The distribution of ratings on attrac-
tiveness is presented in Table 1.


Table 1

Distribution of Judges’ Ratings on Attractiveness

Sum of Five
Judges’ Ratings    Number LB    Number SB
8-12                   13            0
13-20                  17           19
21-23                   2           14
Total                  32           33

It is apparent that LB females were rated 
as more attractive. The next step consisted of 
having slides made out of the pictures. Forty- 
nine of the pictures were converted into black 
and white 2 x 2 slides with the facial area 
still masked out. Since conversion of the pic- 
tures into slides could conceivably have some 
effect upon general attractiveness, inasmuch 
as color, shading, and lightness and darkness 
were changed to a degree, a third group of 
judges, four in number, rated the projected 
slides on the same six-point scale of attrac- 
tiveness used previously. A range of ratings
was obtained from eight to 19. For the final
sample, 20 slides were selected, 10 LB and
10 SB, all falling within the range of 12 to 15
on attractiveness. The average of the 10 LB
slides was 13.2, for the SB, 13.3. An effort
was made to match the slides as much as
possible with respect to the position of the
woman, i.e., to have both members of a pair
standing or kneeling, both in profile or front
view, etc.

In order to help disguise the purpose of
this part of the study, 10 additional pairs, or
a total of 20 pairs, were presented to the sub-
jects. The 10 pairs discussed in the previous
paragraph were the only ones used to select
subjects on the variable of somatic preference.
Of the 10 additional pairs, five were of two
LB females, and the other five of two SB fe-
males. A table of random numbers was used
to determine the order of presentation of the
20 pairs. A crucial pair was presented on the
1st, 6th, 8th, 9th, 13th, 14th, 15th, 17th, 18th,
and 19th trials. For each crucial pair, the or-
der of presentation of the LB and SB slides
was alternated throughout the series.

Subjects were 169 male students enrolled in
an introductory psychology course at Ohio
State University. They were seen in groups
of five or more.


At the beginning of the experimental ses-
sion, seven TAT cards intended to elicit
dependency themes were administered. The
cards chosen for this purpose were 4, 6BM,
7BM, 12, 13B, 14, and 18BM. The test was
presented as one of creative imagination which
the examiner was attempting to standardize
(the TAT had not been discussed in the course
yet), and subjects were given five minutes to
write each story. Aside from these departures,
the usual TAT instructions were given. At the
conclusion of the TAT administration, and
following the distribution of answer sheets,
these instructions were read:


As you may know from your psychology course, 
Professor Sheldon of Harvard has done considerable 
research on the different kinds of body-types. We are
interested in following up some of his research by 
finding out what female body-types are generally pre- 
ferred by the typical American male. One reason for 
this research is that the kinds of body-types pre- 
ferred by American men may have some effect on 
their choice of marital partners, and this in turn can 
affect the body-types of future generations. 

You will be presented with 20 pairs of slides. Each 
slide contains the picture of an attractive nude fe- 
male. The numbers on your sheet stand for the pair 
number. Following each pair number is an A and a 
B. The A represents the first slide of the pair that is 
presented to you and the B represents the second
slide of the pair. After looking at both slides of the
pair, you are to put a circle around the A if you like 
the first slide better or put a circle around the B if 
you like the second slide better. For each pair, you 
encircle the letter representing the slide you prefer. 

You are requested not to talk or make any spon- 
taneous exclamations during the presentation of the 
slides. The slides must be shown in silence.


The entire sequence of slides required about 
seven or eight minutes for presentation. Each 
slide was projected for five seconds, three 
seconds were allowed between the slides of a 
pair, and five seconds were allowed between 
adjacent pairs. Cooperation of the subjects 
was good and, with rare exceptions, little flip-
pancy was noted. Randomly obtained com-
ments at the end of the experimental session
revealed a general lack of awareness concern-
ing the crucial variable in this part of the
study.


Results


The distribution of somatic preferences ap-
proximated a normal distribution. As previ-
ously stated, the judges showed a distinct
preference for large-breasted females, but the
equation of the two groups for general attrac-
tiveness insured the variability which was ob-
tained in the preferences of subjects. Subjects
who selected from 0 to 3 of the large-breasted
females were defined as the SB group (N =
28). The LB group was comprised of those
who selected from 8 to 10 of the large-breasted
females (N = 35). Inasmuch as the mean
number of LB preferences was 5.56, a middle
or no-preference group was obtained by se-
lecting those subjects with either five or six
large-breasted preferences (N = 57).
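
The grouping rule can be written out as a small function. The sketch below is illustrative only: the function name is hypothetical, and it assumes that each subject's score is simply the number of LB slides chosen over the 10 crucial pairs.

```python
def preference_group(lb_choices):
    """Assign a subject to a group from the number of LB slides chosen (0-10)."""
    if not 0 <= lb_choices <= 10:
        raise ValueError("number of LB choices must be between 0 and 10")
    if lb_choices <= 3:
        return "SB"
    if lb_choices in (5, 6):
        return "Middle"
    if lb_choices >= 8:
        return "LB"
    return None  # scores of 4 and 7 fall in none of the three groups


for score in range(11):
    print(score, preference_group(score))
```

Subjects with four or seven LB choices belong to none of the groups, which is consistent with only 120 of the 169 subjects (28 + 57 + 35) entering the TAT comparison.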

The TAT stories of all three groups were in- 
dependently scored by the writer and a gradu- 
ate student in clinical psychology. Neither had 
any awareness of the subjects’ somatic pref- 
erence ratings at the time of the TAT scoring. 
Following the procedure of Kagan and Mus-
sen (5), a story was scored as D (for depend-
ency) if the hero sought help from another 
person in solving a personal problem or was 
disturbed over the loss of a source of love and 
affection. In most respects, D would be similar 
to Murray’s n succorance. Only one D theme 
was scored for each story so that for any 
given story D was simply scored present or 
absent. Of a total of 840 stories there was 
agreement on 782. This 93% agreement in- 
dicates considerable reliability in scoring. A
third judge arbitrated the 58 stories on which 
there was disagreement. 
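
The agreement figure follows directly from the counts given above. As a quick arithmetic check (variable names are hypothetical; the group sizes, the seven cards, and the 782 agreements are from the text):

```python
# Three groups of subjects, each subject writing seven TAT stories.
protocols = 28 + 57 + 35          # SB, middle, and LB groups
stories = protocols * 7           # total number of stories scored
agreed = 782                      # stories on which the two scorers agreed

percent_agreement = 100 * agreed / stories
disagreements = stories - agreed  # stories sent to the third judge

print(stories, round(percent_agreement), disagreements)
```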

The number of D themes in the three 
groups of 120 protocols ranged from zero to 
six with a median of one. In both the middle 
and LB groups the distribution of D themes 
was markedly skewed in the positive direction.
Chi square was therefore used to test for the
significance of differences between groups. The
results are given in Table 2. For each of the
three comparisons the median value was one.


Table 2

Chi-square Differences Between SB, Middle and
LB Groups in TAT D Themes

                   TAT D Themes
            Above     At or Below    Chi
Groups      Median    Median         Square    p
SB            18         10           6.76     .01
LB            11         24
SB            18         10           5.70     .02
Middle        21         36
LB            11         24            .28     .60
Middle        21         36

The SB group gave significantly more D 
themes than either the LB or middle group. 
There was no significant difference between 
the middle and the LB groups. 
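
The chi-square values in Table 2 can be checked from the cell counts. The sketch below assumes an uncorrected 2 x 2 chi square (no continuity correction) with one degree of freedom; the text does not say which formula was used, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Uncorrected chi square and p value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For one degree of freedom the upper-tail probability is erfc(sqrt(chi / 2)).
    p = math.erfc(math.sqrt(chi / 2))
    return chi, p


# Cell counts from Table 2: (first column, second column) for each group.
comparisons = {
    "SB vs. LB": (18, 10, 11, 24),
    "SB vs. Middle": (18, 10, 21, 36),
    "LB vs. Middle": (11, 24, 21, 36),
}
for label, cells in comparisons.items():
    chi, p = chi_square_2x2(*cells)
    print(f"{label}: chi square = {chi:.2f}, p = {p:.3f}")
```

Under this assumption the three chi squares round to the tabled values of 6.76, 5.70, and .28.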


Discussion 
The present result is not what might have 
been expected on the basis of Freudian the- 
ory. In general, the theory has tended to view 
behaviors as the products of either excessive 
frustrations or excessive satisfactions (rein- 
forcements) in an earlier period. With respect 
to the variable under consideration here, most 
analytically oriented writers who have dis- 
cussed the matter at all have held that large 
breast preference is a result of earlier oral 
frustration. In Levey’s discussion of the ulcer 
patient, for example, it is suggested that the 
frustrated dependency needs of the patient 
sought satisfaction in a constant preoccupa- 
tion with large breasts. Ostensibly, such fe- 
males should be perceived as potentially more 
nurturant for basically dependent males if the 
important determinant underlying such a so- 
matic preference is frustrated dependency. 

If the expression of dependency needs in 
TAT stories can be taken to signify conflicts 
in this area predicated on the frustration of 
such needs, and if the absence of these same 
needs in TAT stories implies experiential re- 
inforcement of them, the present results are 
more in accord with the view that drive 
strength is increased by reinforcement. Large 
breast preference would be regarded as the 
consequence of continued satisfaction of de- 
pendency needs rather than their frustration. 
Such an interpretation is consistent with the 
findings of Bernstein (1). In studying the ef- 
fects of infantile experiences on later activi- 
ties, he raised the question of whether it was 
the child with the greatest amount of sucking 
practice as an infant who continued to use the 
sucking act as a preferred mode of behavior 
or, rather, the child with the least amount of 
sucking practice who continued to seek satis- 
faction through sucking. He found that the
amount of sucking reinforcement experienced
by the group whose mothers reported re-
sistance to weaning exceeded the sucking re-
inforcement for the group without weaning
difficulty. His results, then, support a rein-
forcement rather than a frustration theory of
drive strength. Sears and Wise (7) are of the
same view in asserting that the strength of
the oral drive varies with the number of op- 
portunities for its reinforcement so that the 
longer the child feeds by means of sucking, 
the stronger his oral drive will be. The present 
result is consistent with such formulations if 
the assumptions concerning the nature of TAT 
dependency can be maintained. 


Summary 


The purpose of this study was to ascertain
possible relationships between somatic pref-
erence (preference for either large- or small-
breasted females) and dependency as meas-
ured by the TAT. After writing stories to
seven of the TAT cards, 169 male subjects
were presented with 20 pairs of slides; 10
pairs consisted of a small-breasted and a large-
breasted female, previously equated on attrac-
tiveness. On the basis of the subjects’ selec-
tions, large breast preference, small breast
preference, and no preference groups were
elicited.

The small breast preference group gave sig-
nificantly more TAT dependency themes than
either of the other two groups. Speculations
about this result, which is contrary to a widely
held Freudian hypothesis, are offered on the
basis of a reinforcement theory of learning.

Received December 6, 1956.
Identification, Parent-Cathexis, and Self-Esteem


Sidney M. Jourard


Emory University 


Clinical observations provide the basis for 
suspecting that identification with parents, 
feelings and attitudes toward parents (parent- 
cathexis), and self-esteem all cohere as a syn- 
drome. The present investigation is undertaken 
to verify the hypothesis of covariation among 
these variables, and to test relevant hypothe- 
ses concerned with identification and with the 
cathexis-response to persons—in this instance, 
parents and the self.


Method 
Hypotheses 


1. Identification with the parents’ person- 
alities varies with the nature of the feelings 
and attitudes of Ss toward their parents 
(parent-cathexis). Positive parent-cathexis is 
related to high identification scores, and nega- 
tive parent-cathexis is related to low identifi- 
cation scores. 

2. Identification with the parents’ person- 
alities varies with the extent to which the 
parents’ personalities are congruent with the 
Ss’ concepts of the ideal mother and the ideal 
father.

3. Cathexis for the parents varies with the 
extent to which the parents’ traits conform 
with the Ss’ concepts of ideal mother and 
ideal father. 

4. Self-esteem varies with: (a) the degree
of congruence between the Ss’ traits and their
concepts of an ideal self; (b) the extent to
which the Ss’ traits resemble those of their
parents.


Subjects 


Fifty-six male undergraduates and 56 fe- 
male undergraduates from psychology classes 
at Emory University and at the Georgia State 
College of Business Administration served as 
Ss in the study. Mean age was 23.69 years, 
SD 3.39, for the males, and 20.89 years, SD 
3.39, for the females. 


Materials 


Ten questionnaires were constructed from a 
series of forty personality traits which were to 
be responded to by Ss in different ways. The 
traits were: 


1. Sense of humor.
2. Temper.
3. Ability to express self.
4. Ability to express affection.
5. Ability to express sympathy.
6. Self-understanding.
7. Usual mood.
8. General knowledge.
9. Popularity with others.
10. Self-confidence.
11. Ability to accept criticism.
12. Sensitivity to others’ feelings.
13. Intelligence level.
14. Capacity for work.
15. Ability to meet new people.
16. Self-discipline.
17. Ability to make decisions.
18. Tolerance of others’ shortcomings.
19. Ability to overcome self-consciousness.
20. Ability to relax and ‘let hair down.’
21. Depth of feeling.
22. Sense of responsibility.
23. Understanding of intimates.
24. Receptiveness to new ideas.
25. Attitude toward sex.
26. Ease of getting to know.
27. Personality.
28. Ability to control emotion.
29. Ability to put ideas across.
30. Degree of freedom from fear.
31. Degree of independence.
32. Ability to concentrate.
33. Ways of disciplining others.
34. Philosophy of life.
35. Religious beliefs.
36. Business sense.
37. Happiness.
38. Conformity to own moral standards.
39. Promptness in getting things done.
40. Ability to act the right way in every situation.


The traits are seen to be formulated in 
nontechnical language, so as to approximate 
the terms in which Ss think about and de- 
scribe personality. 

The questionnaires which were constructed 
from these traits, and the scores which were 
derived from them, are listed as follows: 

a. Perceived similarity-to-parents question- 
naire. Each S was requested to rate the ex- 
tent to which he resembled his mother and 
father on each of the 40 traits, in accordance 
with the following instructions: 


Below are listed a number of things characteristic 
of yourself, or related to you. Consider each item 
carefully, and decide which of your parents you re- 
semble most in this characteristic. Then, encircle the 
appropriate number beside each trait, in accordance 
with the scale which follows. If you resemble both 
parents on a given trait, then encircle two appro-
priate numbers.


+3: Very close resemblance to your father in 
this respect 

+2: Closely similar to your father in this re- 
spect 

+1: Faintly resemble your father in this re- 
spect 


0: Resemble neither father nor mother

-1: Faintly resemble your mother in this re-
spect

-2: Closely similar to your mother in this re-
spect

-3: Very close resemblance to your mother in
this respect


This scale appeared beside each trait on the 
“Similarity-to-Parents” questionnaire. Total 
perceived similarity scores were computed by 
summing the “plus” and “minus” numbers 
that had been encircled, and were designated 
If and Im, for rated similarity to father and
to mother, respectively. 

b. The cathexis questionnaires. The forty
traits were listed on separate forms, entitled
“Feelings about Father,” “Feelings about
Mother,” and “Feelings about the Self.” The
Ss were instructed to indicate their feelings
about these traits of the respective persons by
encircling a number from the five-point scale
which appeared beside each trait on the sepa-
rate cathexis questionnaires. The numbers sig-
nified:


1. Have strong positive feelings; like very much.
2. Have moderate positive feelings.
3. Have no feelings one way or the other.
4. Have moderate negative feelings.
5. Have strong negative feelings; dislike very much.


Total scores were obtained by summing
and were designated FC, MC, and SC for
father-cathexis, mother-cathexis, and self-ca-
thexis (self-esteem), respectively.

c. Real- and Ideal-Person ratings. Six
identical trait-rating forms were prepared
from the trait list. Each trait was restated in
the form of a continuum, with five scale points
between the extremes. For example, the item
temper was presented as a continuum with
even tempered at one pole, and very quick to
lose temper at the other. The S encircled a
number from 1 to 5, indicating where the per-
son being rated would fall along this con-
tinuum. The socially desirable version of each
trait was randomly ordered throughout the
list, so that for some traits, a score of 1 would
indicate the desirable pole, while for others, a
score of 3 or 5 would represent the desirable
degree. This randomization was done in order
to discourage response sets.³

The rating forms which the Ss completed
were entitled Real Father, Ideal Father, Real
Mother, Ideal Mother, Real Self, and Ideal
Self. The S was instructed to “Put a circle
around the number which best describes the
person, with respect to each trait.”

Two sets of discrepancy scores were calcu-
lated from these rating scales. One set per-
tained to the discrepancy between “real” and
“ideal” ratings for mother, father, and the
self. These were obtained by subtracting the
“real” ratings from the “ideal” ratings, ignor-
ing sign, and then summing. Total “real-ideal”
discrepancy scores were designated Dis-Fi-Fr,
Dis-Mi-Mr, and Dis-Si-Sr.


3 One of the indicators of response sets is a marked
preference on the part of Ss for some one response
category, e.g., for high, middle, or low numbers on
a numerical rating scale. Inspection of the data ob-
tained in the present study showed considerable in-
traindividual variability, from questionnaire to ques-
tionnaire, in responding to items. While this is not
proof that response sets were eliminated, it supports
the contention that they were minimized.

The other set of discrepancy scores com- 
puted from these forms provided a further 
index of similarity to parents, which is called 
the derived similarity score. These scores were 
obtained by comparing “Real Father” ratings 
with “Real Self” ratings, and “Real Mother” 
ratings with “Real Self” ratings, ignoring sign. 
Each pair of scores yielded a discrepancy with 
a possible range from 0 to 4. These discrep-
ancies were summed, and designated Dis-Fr-Sr
and Dis-Mr-Sr for differences between father’s
personality and the self, and mother’s person-
ality and the self, respectively.
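
Both sets of discrepancy scores rest on the same operation: take the absolute difference between two 1-to-5 ratings on each trait and sum over the traits. A minimal sketch follows; the function name and the sample ratings are hypothetical and are not data from the study.

```python
def discrepancy_score(ratings_a, ratings_b):
    """Sum of absolute trait-by-trait differences between two rating forms."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both forms must rate the same number of traits")
    for rating in list(ratings_a) + list(ratings_b):
        if rating not in (1, 2, 3, 4, 5):
            raise ValueError("ratings must be whole numbers from 1 to 5")
    # Each trait contributes a discrepancy from 0 to 4, sign ignored.
    return sum(abs(a - b) for a, b in zip(ratings_a, ratings_b))


# Hypothetical ratings on the first five traits only.
real_self = [2, 4, 3, 1, 5]
real_father = [3, 4, 1, 1, 2]
print(discrepancy_score(real_father, real_self))  # 1 + 0 + 2 + 0 + 3 = 6
```

With 40 traits the total can range from 0 to 160, and a larger score means less similarity (or less congruence), which is why these indices correlate in the opposite direction from the perceived similarity scores.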

In summary, each S completed a total of
ten questionnaires. The names of these ques-
tionnaires, the order in which they were pre-
sented to the Ss, and the scores which were
obtained from them are as follows:


(a) Similarity to parents (If and Im: perceived similarity scores)
(b) Feelings about father’s personality (FC: father-cathexis scores)
(c) Feelings about mother’s personality (MC: mother-cathexis scores)
(d) Feelings about the self (SC: self-cathexis scores, the measure of self-esteem)
(e) Ideal Mother ratings
(f) Real Self ratings
(g) Real Father ratings
(h) Ideal Self ratings
(i) Real Mother ratings
(j) Ideal Father ratings

Discrepancies were computed between each
pair of traits and summed through all traits,
on forms (e) and (i) [Dis-Mi-Mr], (g) and
(j) [Dis-Fi-Fr], and (f) and (h) [Dis-Si-Sr].
These are the discrepancy-between-“real”-and-
“ideal” ratings of the mother, father and self,
and are termed congruence measures through-
out the tables. Further discrepancies were de-
termined and summed between (f) and (g)
[Dis-Fr-Sr] and between (f) and (i) [Dis-
Mr-Sr]. These latter comprise the two addi-
tional measures of similarity between the self
and the parents, and are termed derived simi-
larity scores.⁴


Table 1

Means, Standard Deviations, and Corrected
Reliability Coefficients of the Scales

                     Men                      Women
Scale         Mean     SD      r       Mean     SD      r
If           38.35   22.02    .91     38.31   20.36    .88
Im           34.65   20.65    .90     37.82   19.07    .80
Dis-Fr-Sr    34.89   15.59    .84     37.62   14.70    .88
Dis-Mr-Sr    33.95   17.25    .88     35.09   12.67    .86
FC           86.02   26.18    .92     81.40   26.24    .94
MC           89.40   24.23    .92     87.95   23.39    .94
SC           87.80   22.23    .92     90.55   19.76    .90
Dis-Fi-Fr    36.57   21.40    .93     31.43   21.00    .93
Dis-Mi-Mr    35.76   18.29    .89     35.95   18.71    .92
Dis-Si-Sr    37.18   19.69    .92     39.96   14.53    .86


Testing Procedure 


The questionnaires were stapled as a book- 
let and distributed to the Ss during a class- 
room period. Instructions for filling them out 
were presented orally. Date of birth, sex, and 
marital status were the only identifying in- 
formation that was requested from the Ss, on 
the premise that relative anonymity would en- 
courage frankness in response. The Ss took 
the booklets home with them, and returned 
the completed forms on a following day. 


Results 
General Results and Questionnaire Reliability 
Means, standard deviations, and reliability 
coefficients for all variables are shown in 
Table 1. Each scale has satisfactory reliability. 
None of the mean scores obtained by males 
and females on each scale showed any signifi- 
cant differences. 


Identification with Parents and Parent-Ca- 
thexis 


Hypothesis 1 may be restated: “If you like 
your parents, you tend to be like them.” 


4 The two measures of similarity-to-parents, per-
ceived similarity scores and derived similarity scores,
did not correlate very highly with each other. Scores
on If correlated with Dis-Fr-Sr only -.35 for males,
and -.20 for females; Im correlated with Dis-Mr-Sr
only -.17 among the males, and -.43 among the female
Ss. In spite of their near-independence of each other,
the perceived- and derived-similarity scores corre-
lated in predicted ways with other variables.


Table 2

Correlations Between Similarity-to-Parents Scores
and Parent-Cathexis Scores

                     Men                 Women
Index of
Similarity       FC        MC        FC        MC
If             -.52**              -.45**
Dis-Fr-Sr       .43**               .28*
Im                       -.38**              -.62**
Dis-Mr-Sr                 .20                 .38**

* Significant at .05 level.
** Significant at .01 level.

Table 2 shows that both the perceived (If and
Im) and the derived (Dis-Fr-Sr and Dis-Mr-
Sr) indices of identification, or similarity to
the parents, are significantly correlated with
the parent-cathexis measures. In one instance
only did an r fall short of significance, namely
that between Dis-Mr-Sr and MC for males.
Transformation of the r’s in this table (and
all others) to z’s was performed, in order to
permit cross-sex comparisons of the magnitude
of r. None of the observed sex differences be-
tween r’s in Table 2 were significant.
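
The cross-sex comparison of correlations can be illustrated with Fisher's r-to-z transformation. The sketch below is an assumption about the computation, not a statement of the author's exact procedure: it takes N = 56 for each sex and uses the usual standard error for the difference between two independent z's, the square root of 1/(N1 - 3) + 1/(N2 - 3). The function name is hypothetical.

```python
import math


def fisher_z_ratio(r1, n1, r2, n2):
    """Ratio of the difference between two Fisher z's to its standard error."""
    z1 = math.atanh(r1)  # Fisher's r-to-z transformation
    z2 = math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z1 - z2) / standard_error


N_MEN = N_WOMEN = 56

# Table 2: the perceived similarity-to-father index and FC, men vs. women.
ratio = fisher_z_ratio(-0.52, N_MEN, -0.45, N_WOMEN)
print(round(ratio, 2))
```

Under these assumptions the ratio for this pair of correlations is well below 1.96 in absolute value, in line with the statement that no sex difference in Table 2 was significant.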


Identification with Parents and Parental Con- 
gruence with Ideals 


The second hypothesis predicted that Ss 
tend to “identify” only with those parental 
traits that coincide with their ideals for those 
traits. Thus, the larger the discrepancy be- 
tween the actual (rated) traits of the parents 
and the Ss’ concepts of ideal parent, then the 
smaller are the perceived similarity scores (If
and Im), and the larger are the derived simi-
larity scores. Table 3 shows that the obtained
r’s among the relevant variables support the
predictions in magnitude and in direction. It
may be concluded that Ss tend to identify
only with those parental traits which are ex-
emplary for them; i.e., which coincide with
their ideals.

Only one significant sex difference in r’s was
obtained in Table 3; Im correlated -.26 with
Dis-Mi-Mr for males, and -.62 for females.
This difference yielded a t ratio for the corre-
sponding z’s of 2.37 (p about .02). Evidently
among males, perceived similarity with the
mothers’ traits is more independent of the
mothers’ congruence with ideals than is the
case for female Ss.


Parent-Cathexis and Parental Congruence
with Ideals


It was predicted that the cathexis response
to parents’ traits varies with the magnitude
of the discrepancy between the actual (rated)
traits of the parents, and the Ss’ ideals for
parental traits. Table 4 shows that there is a
substantial correlation between the cathexis
measures and the extent of congruence be-
tween parents’ traits and the Ss’ ideals for
those traits. It is further noted that MC is
more highly related to Dis-Mi-Mr among fe-
male Ss (.75) than among males (.29). The
t ratio for the difference between the corre-
sponding z’s was 3.46 (p less than .01). This
difference suggests that among males, mother-
cathexis is more independent of the mother’s
conformity with concepts of the ideal-mother
than is the case among females.


Self-Esteem, Identification, and Congruence
of Ideal Self with Real Self


Self-cathexis (SC) scores indicate the ex-
tent to which a person likes or dislikes his
own traits. The SC scores may thus be re-
garded as indices of self-esteem. As with the
parent-cathexis scores, SC correlated with the
relevant discrepancy scores (Dis-Si-Sr) at
or beyond the .01 level (male r illegible;
.53 for females).

The fact of covariance between SC scores
and the identification estimates (If, Im, Dis-
Fr-Sr and Dis-Mr-Sr) is shown in Table 5.
The index of self-esteem (SC) is significantly
Table 3

Correlations Between Similarity-to-Parent Scores
and Parental Congruence with Ideals

                        Men                        Women
Index of
Similarity     Dis-Fi-Fr    Dis-Mi-Mr     Dis-Fi-Fr    Dis-Mi-Mr
If            (illegible)                (illegible)
Dis-Fr-Sr        .44**                      .30*
Im                            -.26*                      -.62**
Dis-Mr-Sr                      .64**                   (illegible)

* Significant at .05 level.
** Significant at .01 level.
correlated with all measures of similarity to
the parents among male Ss, and with all in-
dices among the females except Dis-Fr-Sr.
The relatively slight magnitude of the r’s sug-
gests that self-esteem is only partly deter-
mined by parental identification, probably be-
cause identification itself seems to occur only
when the parents’ personalities are exemplary.


Discussion 
General Results 


The fact that none of the comparisons 
between mean scores of males and females 
that were reported in Table 1 showed signifi- 
cant differences warrants some comment. One 
might expect, for example, that males would 
obtain higher similarity-to-father scores than 
women, and that women would obtain higher 
similarity-to-mother scores than men. Fur- 
ther, it might be predicted that men would 
show greater similarity to their fathers than 
to their mothers, and vice versa for the fe- 
male Ss. The lack of significant differences on 
these variables may possibly be attributed to 
the fact that the traits employed in the ques- 
tionnaires are highly general in nature, and 
hence cannot be readily sex-typed. 


Identification with Parents, Parent-Cathexis, 
and Parental Congruence with Ideals 


Perceived- and derived-similarity scores 
were shown to be correlated significantly with 
the parent-cathexis scores. These findings lend 
support to Mowrer’s theory of “developmental 
identification” which, loosely restated, asserts 
that positive feeling toward the parents fos- 
ters identification with their traits (6). It 
should be noted that the similarity scores used 
herein are at best only crude indices of identification.


Table 4


Correlations Between Parent-Cathexis Scores and 
Parental Congruence with Ideals 


Men Women 
Congruence 
Measure FC MC FC MC 
Dis-F;-F, 383** .78** 
Dis-M;-M, 229° tone 


* Significant at .05 level. 
** Significant at .01 level. 


Table 5


Correlations Between Self-Cathexis Scores and 
Measures of Similarity to the Parents 


Index of 

Similarity Men Women 
Is —.37** —.43" 
Dis-F,-S; -50** 19 
In $ —.38** —.32* 
Dis-M,-S, SiS .28* 


* Significant at .05 level. 
** Significant at .01 level. 


It is always possible that the Ss’ observa- 
tions and judgments of similarity were autis- 
tic, i.e., strongly influenced by their feelings 
toward their parents. Indeed, the very fact of 
correlation between cathexis scores and simi- 
larity scores suggests that autism may have 
influenced their judgments. Finally, any real 
similarity between the Ss and their parents 
may have derived from sources other than 
identification, viz., patterned socialization and 
“direct tuition” practises (1, pp. 56-59) 
which produced similarities “by accident.” 

The findings with respect to identification 
and parental congruence with ideals suggest 
that identification is a selective process. It 
may be hypothesized that children do not 
identify with all of their parents’ traits;
rather, they select those traits which seem 
worthy of emulation, which will be instru- 
mental in the attainment of assorted valued 
ends, viz., parental approval, success in 
achievements, etc. (cf. 3, 6). 


Parent-Cathexis and Parental Congruence 
with Ideals 


Cathexis has been redefined from its origi- 
nal psychoanalytic meaning by Parsons. He 
states that cathexis refers to “a state of the 
organism—a state of euphoria or dysphoria— 
in relationship to some object. . . . It is ob- 
ject-oriented affect. . . . It involves attaching 
affective significance to an object” (8, p. 10; 
his italics). In the present context, parent- 
cathexis refers to the feelings of liking-dislik- 
ing for parents’ traits. The idea that cathexis- 
responses might be related to the degree of 
congruence between actual or perceived char- 
acteristics of the object, and ideals pertaining 
to that object, was suggested by an earlier
work dealing with body-cathexis (4). As a 
hypothesis for further exploration, it is sug- 
gested that cathexis for other persons, the self, 
or any object is in part a function of the con- 
gruence of the object with Ss’ ideals. This 
hypothesis has bearing on the theory of inter- 
personal attractiveness (7), sociometry, aes- 
thetics, etc. 


Self-Esteem, Identification, and Congruence 
between Ideal Self and Real Self 


Self-esteem—positive cathexis for the self— 
appears to hinge upon congruence between 
the “real self” and the “self-ideal.” The ob- 
tained correlations between SC scores and 
similarity scores suggest that the parents’ per- 
sonalities may have served, not only as a 
model for the “real self” of the Ss, but also 
as the model for their self-ideals. Most psy- 
choanalysts, in fact, trace the origin of the 
self-ideal (superego and ego-ideal) to identifi- 
cation with parents. The present findings may 
be viewed as partial support to the psycho- 
analytic theory of the relationship between 
identification and the superego (2, pp. 103-105).


Summary 

Fifty-six male and 56 female college stu- 
dents were tested with a series of question- 
naires designed to measure similarity with 
parents’ personalities, parent-cathexis, and 
self-cathexis. 

Scores indicative of the degree of similarity 
between the self and the parents were found 
to vary with parent-cathexis, and with the de- 
gree of congruence between the parents’ per- 
sonalities and the Ss’ concepts of the ideal
parent. 

Parent-cathexis scores were found to vary 
with the degree of congruence between the 
parents’ rated personalities, and Ss’ concepts 
of the ideal parent. 

Self-cathexis (a measure of self-esteem) 
was found to vary with (a) the congruence 
of the “real self” with the “self-ideal” and 
(b) with the degree to which Ss resembled 
their parents’ personalities. 

The findings were related to the theory of
identification, and to a general theory of the 
cathexis-response to objects. 

The overall results appear to confirm the
clinical observation that identification with
parents, feelings and attitudes to parents, and
self-esteem all cohere as a syndrome (cf. 9,
pp. 31-62).


Received November 21, 1956. 


Perceptions of Significant Family and Environmental
Relationships in Aggressive and Withdrawn Children


Teachers College, Columbia University


The mediating function of cognitive proc- 
esses has been given significant attention in a 
variety of theories and experiments. Tolman 
(11), for example, emphasizes the importance 
of expectancies in guiding behavior; Hull (4) 
has suggested that pure stimulus acts con- 
tribute stimuli necessary for problem solving; 
Mowrer (7) regards preparatory set as a de- 
terminant of behavior. That such factors as 
value, need, and expectancy may influence 
perceptions has been demonstrated in experi- 
ments reported by McClelland and Liberman 
(6), Vanderplas and Blake (12), Bruner and 
Postman (1), and others (2, 3, 5, 8, 9). 

The present study is an attempt to extend 
investigations of the relationship between 
cognitive and other behavioral processes to 
the problem of personal adjustment. The spe- 
cific question dealt with was whether char- 
acteristic patterns of overt adjustment are 
systematically associated with characteristic 
forms of perceptual behavior. Pursuit of this 
inquiry entailed the identification of groups 
of subjects differing markedly in overt ad- 
justment patterns, the formulation of hy- 
potheses accounting for such behavioral dif- 
ferences in terms of differing cognitions, and 
a final determination of relationships between 
the cognitive and adjustment variables. 

Aggression and withdrawal in school-age 
children were selected as the adjustment vari- 
ables. Using as a guide the phenomenological 
approach (10), which postulates a one-to-one 
relationship between an individual’s behavior 
and his psychological field, two factors in the
perceptual fields of aggressive and withdrawn 
children may be considered significant in de- 
termining behavior. The first factor is the de- 
gree to which the individual perceives the 
world to be threatening. Thus, aggressive chil- 
dren, apparently less deterred by threat, are 
likely to perceive the world as less threaten- 
ing than do withdrawn children. It is equally 
possible that aggressive and withdrawn indi- 
viduals do not differ in their perceptions of 
threat but that, in the face of equal dangers, 
a second factor, the individual’s perception of 
his resources for dealing with the threat, be- 
comes the significant cue determining overt 
behavior. It was predicted, therefore, that ag- 
gressive and withdrawn children would differ 
in the degree to which they perceive the world 
to be threatening and in their perceptions of 
their resources for dealing with threat. Specif- 
ically, the hypotheses tested in this study are 
that in perceptual tasks related to significant 
family experiences, aggressive children: 

1. Estimate the size, strength, and ability 
of child figures to be greater than do with- 
drawn children and, similarly, estimate the 
size, strength, and ability of child figures to 
be greater in relation to those of the parental 
figures than do withdrawn children.

2. Expect punishment for various acts to 
be less severe than do withdrawn children. 

3. Describe the immediate outcome of situa- 
tions as more favorable than do withdrawn 
children. 


Subjects 


The subjects used in this study were 101 
boys aged six through ten selected from among 
1,531 children in the first five grades of a
public school system in a small New York
State city. Selections were made by a committee
consisting of a director of elementary
guidance, a school psychologist, school prin- 
cipals, and classroom teachers. The subjects 
were divided into an experimental group of 
33 boys manifesting persistently aggressive 
behavior (Group A); an experimental group 
of 35 boys manifesting persistently withdrawn 
behavior (Group W); and a control group of 
33 well-adjusted boys (Group N). The groups 
were found to be generally similar in age, 
grade, and social class membership distribu- 
tions. Grade achievement scores on reading 
and arithmetic tests, used as an estimate of 
intelligence, showed the aggressive and with- 
drawn groups to be generally similar. The 
well-adjusted group scored higher on both 
tests than did the aggressive or the withdrawn 
groups. These results, significant at the 5 per 
cent level of confidence, may be attributed, in 
part, to the clinically observed tendency for 
disturbed children to be less efficient in learning.
It is also probable, however, that moderately
brighter children were selected as members
of the well-adjusted group.


Experimental Tasks 


Each subject was given two perceptual 
tasks: (a) to answer questions about seven 
pictures of a family group in presumably 
anxiety-arousing situations (the Family Situa- 
tions Task); and (b) to draw a picture of his 
parents and himself (the Family Drawings 
Task). These individually administered tasks 
were designed to assess the subject’s percep- 
tions of five variables pertinent to the hy- 
potheses being tested. The variables were: 


1. Size, measured by the height in centimeters of 
drawings of parent and child. 

2. Strength, measured by responses to questions 
concerning tests of physical strength between a 
father and son. 

3. Ability, measured by descriptions of the be- 
havior of parent and child figures in coping with 
critical situations. 

4. Expectations regarding punishments. 

5. Expectations regarding the outcomes of situa- 
tions. 


The Family Situations Task consisted of 
seven original pictures of a family group 
which were drawn by an artist from detailed 
instructions. A questionnaire containing thirty 
items was developed to accompany the pictures.
Photographic reproductions of the pictures
were used in administering the task.
Brief descriptions of the pictures are presented:


Card One. Father, mother, and a little boy are
walking into the dining room to have breakfast.
Their dog is pulling at the tablecloth with all the
food on it and threatening to spoil the breakfast.

Card Two. The father and son are having a
weight-lifting contest. Seven differently sized weights
are in front of them.

Card Three. The family is reading in the living
room. They are so engrossed in their reading that
they do not notice a fire that has started in a curtain.

Card Four. An obviously angry father is confronting
a boy.

Card Five. The parents are sitting at a table in
the kitchen, engrossed in conversation. The boy is
running in to them, obviously excited.

Card Six. Father and son are having a tug-of-war.

Card Seven. The family is standing in front of a
lion’s cage at the zoo. The lion is angry and has [...]
a bar of his cage.


The Family Drawings Task consisted of a
single drawing made by the subject of his
parents and himself. The directions were:
“Make a fast drawing of your father, your
mother, and you. Make everybody standing
up.” Drawings were done on a single sheet of
white paper, 8½ by 11 inches.


Procedures in the Analysis of Data 


Data for the size and strength variables of
the first hypothesis were scored objectively.
Size was measured by the height in centimeters
of drawings of parents and child figures;
strength was measured by verbal responses
to seven questions relating to tests of
physical strength between a father and son.
The third variable dealt with in the first
hypothesis, ability, was measured by ratings of
verbal descriptions of the behavior of parent
and child figures in coping with situations
involving varying degrees of threat to comfort
and safety. Ability was rated by three judges
on a four-point scale of ability to cope
effectively with the threatening aspects of the
situation.

The second hypothesis, which contends
that aggressive children expect punishment
for various acts to be less severe than do
withdrawn children, was measured entirely by
ratings of the subject’s description of punishment
inflicted upon a boy for various acts.


Table 1
Heights of Family Figures and Ratio of Height of Boy to Height of Father
(Measured in centimeters)

                                                               Difference Between
                                                                     Groups
                        Group A        Group W        Group N    A vs. W  A & W vs. N
Figure                  M      SD      M      SD      M      SD     t         t
Boy                    7.04   3.56    6.18   2.96    5.36   2.09   1.08     2.29*
Father                 9.31   3.92    8.76   3.59    7.57   2.54    .64     2.28*
Mother                 9.29   4.08    8.43   3.97    7.08   2.44    .88     Qos
Boy/Father (ratio)      .77    .25     .72    .23    22      .19    .86     ll


* Significant at .05 level.


Three judges rated the responses on a
four-point scale of severity of punishment.

The third hypothesis, asserting that aggres- 
sive children describe the immediate outcomes 
of situations more favorably than do with- 
drawn children, was measured by responses 
to seven questions requiring the subject to 
predict the outcome of situations. Five of the 
questions were objectively scored, while two 
questions were rated on a four-point scale of 
severity or mildness of outcome. In all cases 
where ratings were used, agreement of the 
judges on 85 per cent of the ratings was set 
as a minimum standard of reliability. 
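
The reliability standard described above can be illustrated with a minimal sketch. It assumes that "agreement" means all three judges assigned the same scale point to a response; the text does not say whether unanimous or pairwise agreement was intended. The function name and the ratings are hypothetical and are not data from the study.

```python
from typing import Sequence


def unanimous_agreement_rate(ratings: Sequence[Sequence[int]]) -> float:
    """Share of responses on which every judge gave the same scale point.

    `ratings` holds one inner sequence per response, one rating per judge.
    """
    if not ratings:
        raise ValueError("at least one rated response is required")
    unanimous = sum(1 for row in ratings if len(set(row)) == 1)
    return unanimous / len(ratings)


# Hypothetical ratings by three judges on a four-point scale (illustration only).
example = [(1, 1, 1), (2, 2, 2), (4, 4, 4), (3, 3, 2), (2, 2, 2)]
rate = unanimous_agreement_rate(example)

# The study set 85 per cent agreement as the minimum standard.
meets_standard = rate >= 0.85
```

With these illustrative ratings the judges agree on four of five responses, which would fall short of the 85 per cent standard.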


Results 


The results obtained for the first hypothesis 
indicated that aggressive children do tend to 
perceive child figures to be stronger in rela- 
tion to parent figures than do withdrawn chil- 
dren, but that the two groups have essentially 
similar perceptions regarding size and ability. 
The hypothesis that aggressive children ex- 
pect punishment to be less severe than do 
withdrawn children was not supported. Sub- 
stantial evidence was found, however, for the 
hypothesis that aggressive children describe 
the outcomes of situations as more favorable 
than withdrawn children. 


Size, Strength, and Ability 
Table 1 shows that the disturbed children
(Group A and Group W) tended to have similar
perceptions of the size of child figures and
parent figures. On the other hand, Table 1
demonstrates that there was a difference between
the disturbed children, taken as one
group, and the nondisturbed children (Group 
N), with regard to the size variable. The dis- 
turbed children drew both parent and child 
figures taller than did the nondisturbed chil- 
dren. The disturbed and the nondisturbed 
groups did not differ, however, with respect 
to the ratio between the height of the boy 
figures and the height of the father figures.
The finding of absolute size differences with- 
out associated differences in the perceptions 
of relative size has only minor relevance, since 
the absolute size differences cannot be ex- 
plained in terms of differing perceptions of 
threat or resources for dealing with threat. 
In their responses to two of the seven ques- 
tions concerning relative strength, the aggres- 


Table 2
Estimated Weight-Lifting Ability in Pounds

                                                     Differences
                                                       Between
                                                       Groups
                           Group    Group    Group
Situation                    A        W        N       A vs. W
Boy’s present ability
  Mean                      61.0     40.9     50.7
  SD                        30.8     25.9     19.7
  t                                                     2.86**
Boy’s future ability
  Mean                     161.4    117.9    126.8
  SD                        84.1     46.4     48.0
  t                                                     2.58**


** Significant at .01 level.


Table 3

Optimistic versus Pessimistic Outcomes of Situations

                                                            Groups Compared
                                                         A vs. W      A & W vs. N
Situation                                                χ²     p      χ²     p
Save or spoil breakfast?                                5.81  <.02*
Extent of damage from fire?                              .02  <.99    4.04  <.30
Will parents give boy money?                            8.43  <.01**
Father lost his job, what will happen to the family?    4.61  <.30     .07  <.99
Long, short, or medium amount of time for father to
  find another job?                                     4.12  <.20    1.43  <.70
Amount of time needed to find another job?              2.70  <.30    3.14  <.30
What will happen to family if lion breaks out of cage?  1.45  <.70    £03   <.30


* Significant at .05 level.
** Significant at .01 level.


sive children perceived child figures to be 
stronger in relation to parental figures than 
did the withdrawn children; the responses to 
the other five questions showed no differences 
between the experimental groups. Table 2 pre- 
sents the results for the two questions for 
which significant findings were obtained. 
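
As a rough check on the weight-lifting comparisons, a t ratio can be recomputed from the group means and standard deviations reported in Table 2. The sketch below assumes a pooled-variance t test and assumes that all 33 Group A boys and all 35 Group W boys answered both questions; the article states neither, so the recomputed values should only be expected to fall near the published 2.86 and 2.58, not to match them exactly. The function name is illustrative.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups, computed from summary statistics."""
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error


# Group A versus Group W, values as read from Table 2;
# group sizes of 33 and 35 are an assumption taken from the Subjects section.
t_present = pooled_t(61.0, 30.8, 33, 40.9, 25.9, 35)
t_future = pooled_t(161.4, 84.1, 33, 117.9, 46.4, 35)
```

Both recomputed ratios land within about a tenth of the published values, which is consistent with the decimal points as restored in the table.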

Statistical analysis of the results obtained 
for the ability variable, using 24 chi-square 
tests, produced no statistically significant findings.
The reliability of the results obtained for
this variable may be questioned, however, for
several reasons. Two of the three pictures
used to obtain measures of relative ability, the
fire scene and the lion scene, deal with situations
which not all of the children could be
expected to relate to their own experiences.
In addition, the questions used for this vari- 
able permitted wide latitude in responses; 
such questions were consistently unrevealing 
of differences between groups throughout the 
study. 


Punishment 


The results obtained for the punishment 
variable do not support the hypothesis that 
aggressive children expect punishment to be 
less severe than do withdrawn children. The 
statistical analysis of the ratings of the verbal 
responses, using six chi-square tests, produced 
no results indicative of differences between the 
experimental groups or between the disturbed 
and the nondisturbed children. It appears, 
therefore, that aggressive and withdrawn children,
despite marked differences in overt behavior,
tend to have similar expectations regarding
the severity of punishment. It should
be noted, however, that only one picture was
used in measuring this variable and that the
questions used permitted wide latitude in
responses.


Expected Outcomes 


The findings concerning expected outcomes,
as shown in Tables 3 and 4, support the
hypothesis that aggressive children describe the
immediate outcomes of situations as more
favorable than do withdrawn children. Table 3
demonstrates that significant affirmative
results were obtained for two of the seven
questions. When responses to all seven questions
were dichotomized as tending to be either


Table 4


Number of Situations Described as Having
Favorable Outcomes*

                                               Difference
                                                Between
                                                Groups
              Group     Group     Group
Measure         A         W         N           A vs. W
Mean           3.49      2.23      3.42
SD              31       1.46      1.63
t                                                3.09**


* An optimism score was arrived at for each subject by [...]




optimistic or pessimistic, the mean number of 
optimistic responses for aggressive children 
was found to be significantly greater than that 
for withdrawn children, as shown in Table 4. 
These findings constitute substantial evidence 
for the validity of this hypothesis. 
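
The chi-square comparisons of Table 3 rest on cross-tabulating group membership against optimistic or pessimistic responses. A minimal sketch of that computation for a two-by-two table follows. The cell counts are hypothetical, chosen only so that the row totals equal the 33 aggressive and 35 withdrawn boys; the article does not report the actual frequencies, and no continuity correction is applied here.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    # Shortcut form of sum((observed - expected)^2 / expected) for a 2 x 2 table.
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts (illustration only): optimistic / pessimistic responses.
aggressive = (22, 11)   # 33 boys in Group A
withdrawn = (13, 22)    # 35 boys in Group W
chi2 = chi_square_2x2(*aggressive, *withdrawn)

# 3.84 is the .05 critical value of chi-square with 1 degree of freedom.
significant_at_05 = chi2 > 3.84
```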


General Conclusions 


The results obtained support the conclusion 
that differences in the overt adjustment pat- 
terns of aggressive and withdrawn children 
are related to differences in expectations about 
the outcomes of situations that are relevant 
to the needs and security of the individual. 
It would appear that aggressive children are 
guided in their overt behavior by a greater 
degree of confidence in a favorable outcome 
than are withdrawn children. The evidence 
that aggressive children tend to perceive child 
figures to be stronger in relation to parent fig- 
ures than do withdrawn children is considered 
to be most economically explainable as further 
support for the conclusion that aggressive 
children tend to be more optimistic than do 
withdrawn children. 


Discussion 


The results are consistent with the general 
contention that behavior is mediated by cen- 
tral, cognitive processes. The factor of mood, 
with aggressive children tending to be opti- 
mistic, and withdrawn children tending to be 
pessimistic, emerges as significant in guiding 
the behavioral adjustments of children. Pre- 
dictions that aggressive and withdrawn chil- 
dren differ in the degree to which they per- 
ceive the world to be threatening, and in their 
perceptions of the adequacy of their resources 
for dealing with threat were, in general, not 
supported. In view of the over-all results of 
the study, the differences in the overt adjust- 
ment patterns of aggressive and withdrawn 
children appear to be most adequately and 
simply explainable in terms of reward learn- 
ing theories. Aggressive children appear to 
have learned from experience that aggression 
may lead to reward, either in the attainment 
of goals or in the reduction of anxiety, while 
withdrawn children, perhaps having had less 
fortunate experiences in meeting needs in a 
direct manner, appear to have learned that 
withdrawal and renunciation of goals are most 
effective in allaying anxiety. 


Summary


This study sought to determine whether 
characteristic patterns of overt adjustments 
are systematically associated with character- 
istic forms of perceptual behavior. Specifically, 
it was hypothesized that in perceptual tasks 
related to significant family experiences, ag- 
gressive children estimate the size, strength, 
and ability of child figures to be greater both 
in absolute terms and greater in relation to 
the size, strength, and ability of parental fig- 
ures than withdrawn children; that they ex- 
pect punishment to be less severe, and de- 
scribe immediate outcomes of situations as 
more favorable than withdrawn children. The 
results lead to the conclusion that differences 
in the overt adjustment patterns of aggressive 
and withdrawn children are related to differ- 
ences in expectations about the outcomes of 
situations relevant to the needs and security 
of the individual. 


Received December 10, 1956. 


References 


1. Bruner, J. S., & Postman, L. Perception, cognition,
and behavior. J. Pers., 1949, 18, 14-31.

2. Cowen, E. L., & Beier, E. G. The influence of
“threat-expectancy” on perception. J. Pers.,
1950, 19, 85-94.

3. Diven, K. Certain determinants in the conditioning
of anxiety reactions. J. Psychol., 1937, 3,
291-308.

4. Hull, C. L. Principles of behavior. New York:
Appleton-Century, 1943.

5. Lazarus, R. S., Yousem, H., & Arenberg, D. Hunger
and perception. J. Pers., 1953, 21, 312-328.

6. McClelland, D. C., & Liberman, A. M. The effect
of need for achievement on recognition of
need-related words. J. Pers., 1949, 18, 236-251.

7. Mowrer, O. H. Learning theory and personality
dynamics. New York: Ronald Press, 1950.

8. Razran, G. H. S. A quantitative study of meaning
by a conditioned salivary technique (semantic
conditioning). Science, 1939, 90, 89-90.

9. Riess, B. F. Semantic conditioning involving the
galvanic skin reflex. J. exp. Psychol., 1940, 26,
238-240.

10. Snygg, D., & Combs, A. W. Individual behavior.
New York: Harper, 1949.

11. Tolman, E. C. Purposive behavior in animals and
men. New York: Appleton-Century, 1932.

12. Vanderplas, J. N., & Blake, R. R. Selective sensitization
in auditory perception. J. Pers., 1949,
18, 252-266.

13. Warner, W. L., Meeker, M., & Eells, K. Social
class in America. Chicago: Science Research
Associates, 1949.


Journal of Consulting Psychology
Vol. 21, No. 5, 1957


The Role-Taking Hypothesis in Delinquency¹


Charles F. Reed

and Carlos A. Cuadra
RAND Corporation 


On the basis of an hypothesis concerning 
psychopathy, Gough has devised a scale for 
the identification of predisposition for delin- 
quent behavior (2, 6, 7). The present study 
is an experimental examination of the opera- 
tion of the hypothesized variable in the pro- 
duction of scores on the scale. 

Gough finds the psychopath unable to “look
upon the self as a ‘social object’ ” or to
“elaborate an adequate and realistic set of
social expectancies and critiques” (7, p. 207).
For some clinical observers, however, one of
the chief characteristics of psychopaths is
their singular facility for slipping from one
role to another, assuming the demeanor most
opportune for the moment. Gough emphasizes
that it is not a deficiency of role-playing ability
to which he refers, but an absence of the
“residuals which ordinarily accrue as a consequence
of interactional experience . . .” (7,
p. 207). However, these residuals are not
identified specifically, and Gough’s concern
seems to remain with the supposed inability
to see one’s self as seen by others.

Gough’s Delinquency Scale (De)² incorporates
this hypothesis and appears to have
practical screening efficiency in differentiating
between delinquent and nondelinquent samples
(5, 6, 7). The item “I often think about


1 This study was inaugurated while both investigators
were members of the staff of the Clinical Psychology
Service of the VA Hospital, Downey, Illinois.
The authors wish to acknowledge the cooperation
and assistance of the hospital administration and the
Nursing Education Service throughout the course
of the project.

2 Now renamed the Socialization (So) scale and
scored in reverse in the California Psychological
Inventory (6).


how I look and what impression I am making
upon others” (answered False) is given as an
example of a cluster of items reflecting “role-taking
deficiencies, insensitivity to interactional
cues and the effects of one’s own behavior
on others” (7, p. 209). Items such as
this seem rather to be reports of indifference
to the opinion of others, not direct manifestations
of incapacity in role-taking. Whatever
the basis for the efficacy of De for screening,
the data presented by Gough do not warrant
the conclusion that incapacity in role-taking
has been demonstrated. It could be argued
with equal justification that part of the scale’s
power lay in items admitting past delinquencies,
e.g., “I have never been in trouble with
the law” answered False, since seven items in
the scale are of this nature.

Data pertaining to the relative social sensitivity
of subjects who receive high scores on
De will be presented in this report. Social
sensitivity was defined experimentally as the
ability of a subject to predict descriptions
made of him by peers. A normal sample was
chosen instead of a sample of delinquent
subjects for methodological reasons concerned
with equivalence of task and with acquaintanceship
variables. The study deals with an
examination of a delinquency scale rather
than with delinquents themselves. References
to “High De scorers” are not intended to
convey the impression that these subjects were
delinquent.


Procedure 


Successive classes of student nurses undergoing
training at a VA neuropsychiatric hospital
were used as Ss. Each class was tested
near the close of three months’ residence at
the hospital, where the students had worked
on wards, attended classes, and lived together.
There was ample opportunity for observation
of each other in a variety of working conditions,
some of them stressful.

For the purposes of this study, each S was
assigned to a group of four Ss. Each S was
asked to describe herself on an adjective
checklist (I), marking every adjective in the
list with a plus when it was true or generally
true, with a minus when it was false or generally
false. As a second task (II), the S described
each of the other members of her
group of four. Finally, she attempted to predict
how she would be described by the members
of the small group as a group (III). Precautions
of anonymity and seating were taken
to foster candid appraisals.

The forced-choice technique was used because,
in a pilot study with Navy hospital
corps students in which the adjectives were
checked only when they applied to a subject,
there seemed to be a spurious accuracy in
predicting group description. The fewer adjectives
chosen, the higher the accuracy score.
By employing forced choices, the correlation
between accuracy and the number of adjectives
marked plus was reduced to zero.

The California Psychological Inventory, of 
which the scale is a part, was administered as 
a final procedure. Complete data were avail-
able for 204 Ss; partial data existed for 25 
more Ss. 


Quantitative Data 


The correspondence between an S’s pre- 
dicted description (III) and the descriptions 
which are made of the S by her peers (com- 
posite II’s) is the variable of chief interest. 
A score was given whenever, on a given ad- 
jective, two of the three peers recorded the 
same sign as that recorded by the S. Total 
score, Predictive Accuracy, is the sum of the 
adjectives on which such agreement occurred. 

The correspondence between S’s self-de- 
scription (I) and the peer-description (com- 
posite II’s) was scored in a similar manner 
and the total score called Self-peer Corre- 
spondence. 
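
The scoring rule for the two correspondence scores can be sketched as follows. The sketch assumes that "two of the three peers" means at least two, and it represents each checklist as a mapping from adjective to a plus or minus sign; the function name, the data layout, and the three adjectives are hypothetical and serve only as illustration.

```python
from typing import Dict, List

Checklist = Dict[str, str]  # adjective -> "+" or "-"


def agreement_score(own: Checklist, peers: List[Checklist], needed: int = 2) -> int:
    """Count adjectives on which at least `needed` peers recorded the subject's sign."""
    score = 0
    for adjective, sign in own.items():
        matches = sum(1 for peer in peers if peer.get(adjective) == sign)
        if matches >= needed:
            score += 1
    return score


# Hypothetical checklists for one S and her three peers (Task II descriptions of her).
peer_descriptions = [
    {"calm": "+", "tense": "-", "witty": "-"},
    {"calm": "+", "tense": "+", "witty": "-"},
    {"calm": "-", "tense": "-", "witty": "-"},
]
predicted = {"calm": "+", "tense": "-", "witty": "+"}         # Task III
self_description = {"calm": "+", "tense": "+", "witty": "+"}  # Task I

predictive_accuracy = agreement_score(predicted, peer_descriptions)
self_peer_correspondence = agreement_score(self_description, peer_descriptions)
```

The same function serves both scores because the text describes Self-peer Correspondence as scored "in a similar manner"; only the checklist supplied by the S changes.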

Additional scores were evolved in the course 
of analysis of the data. They will be described 
after the report of initial results. 


Peer-nomination Data 


A second major source of data was provided 
by material gathered some months after the 
initial part of the study was completed. All Ss 
were contacted by mail and asked to select 
from a list of their classmates five whom they 
considered the most insightful of the class and 
five whom they considered least insightful. In- 


Table 1 
Comparison of Scores of High-De Group and Low-De Group 


High-De group Low-De group 

Variable Mean SD Mean SD CR 
Predictive Accuracy 161.08 20.45 178.38 13.81 4.90** 
Estee oea 161.56 17.70 rga 19:66 Stat 

ite IP 

G EE Cit 356.73 199.62 519.46 133.91 3.41%* 
Coe ee Count 382.95 148.36 509.50 119.50 3.25** 
(on T) 148.3 e i 

Sterene 126.12 32.46 39 25.08 3.79 
Favorable Self-description 20.58 7.83 26.45 6.71 3.99**+ 

Favorable Peer-description 29.52 Sel p cty 58 

Favorable Expected 

nti 19.24 9.20 26.09 6.00 4.36** 
geen description 47.40 1695 37.32 1407 3.20% 


Anticipated Disagreement 


sight was defined as follows: “An insightful
person has the ability to recognize and under- 
stand the motives underlying her behavior 
and is aware of the effects of her behavior on 
other persons. She is alert to what other peo- 
ple think of her as a person. (In making judg- 
ments, do not be influenced by intelligence, 
likeability, etc., which are not necessarily re- 
lated to insight. An unpleasant person, for 
example, still could be an insightful person.)” 
Approximately 60 per cent of the rating forms 
were returned. 


Results and Discussion 


Comparison of the scores of Ss at the ex- 
treme quartiles of the De distribution yielded 
the results summarized in Table 1. 


Accuracy Scores 


The finding of chief interest is the relative 
inaccuracy of the high De scorers in predict- 
ing how they will be described by their peers. 
The difference is significant beyond the .001 
level. Even if the highest quartile is com- 
pared with the remainder of the sample in a 
one-tailed test the difference between means 
on Predictive Accuracy yields a critical ratio 
of 2.72, significant at .003. The correlation 
between Predictive Accuracy and De for the 
entire sample is — .41. 

Several factors which might attenuate the 
inference to be made from this finding were 
examined. 

The task of describing another person may 
be restricted by certain unverbalized conven- 
tions. Some characteristics may be assumed 
by most members of a group to be properties 
of most other members of the group and of 
themselves. In the present sample, for exam- 
ple, the adjective “intelligent” was generously 
used. A spuriously high accuracy score may 
be obtained by the S who is not responding 
to the particular members of her group of 
four, but who is accurately gauging the prob- 
ability of an adjective being checked plus or 
minus when any S was describing any other S. 

In this regard, it is interesting that the Self- 
peer Correspondence score also shows the high 
De scorers differing significantly from the low 
scorers. Correlation between the two scores
for the entire sample is -.29. Apparently,


Charles F. Reed and Carlos A. Cuadra 


the difference between the extreme groups is
not solely a matter of anticipating the verbal
responses of peers. The high De's do not just
mis-guess the descriptions that will be made
of them. Their "private" self-descriptions differ
more from the descriptions made by their
peers than do the self-descriptions of the low
De's from their peer-descriptions.

To investigate further the possible effects of
the popularity or unpopularity of adjectives,
a Conventional-word Count was made. Each
adjective was ranked according to frequency
of use in the peer-descriptions (Checklists of
Task II) and assigned a corresponding scoring
weight. Self-descriptions (I) and Predicted-descriptions
(III) were scored for the
sum of the weighted scores of adjectives
marked with a plus. A high score indicates
that S marked adjectives in the direction
which conformed to the group's conventional
usage.
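The scoring just described can be sketched as follows. This is a minimal illustration, not the authors' procedure: the paper does not say how ranks were turned into weights or how ties were handled, so the rule used here (least-used adjective gets weight 1, tied adjectives share a weight) and the function names are assumptions.

```python
from collections import Counter

def rank_weights(peer_checklists):
    """Weight each adjective by its frequency rank in the peer-descriptions.

    peer_checklists: list of sets of adjectives marked plus (Task II).
    Assumption: least-used adjective gets weight 1; ties share a weight.
    """
    counts = Counter(adj for checklist in peer_checklists for adj in checklist)
    ordered = sorted(counts, key=lambda adj: (counts[adj], adj))
    weights = {}
    for position, adj in enumerate(ordered, start=1):
        prev = ordered[position - 2] if position > 1 else None
        if prev is not None and counts[prev] == counts[adj]:
            weights[adj] = weights[prev]   # tie: reuse the earlier weight
        else:
            weights[adj] = position
    return weights

def conventional_word_count(plus_adjectives, weights):
    """Sum the weights of the adjectives an S marked with a plus."""
    return sum(weights.get(adj, 0) for adj in plus_adjectives)

# Hypothetical peer-descriptions, for illustration only.
peers = [{"intelligent", "kind"}, {"intelligent"}, {"intelligent", "moody"}]
weights = rank_weights(peers)
print(conventional_word_count({"intelligent", "kind"}, weights))
```

An S who marks mostly popular adjectives accumulates a high sum, which is the sense in which a high score reflects conventional usage.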

High De's marked adjectives less conventionally,
both in the prediction task and in describing
themselves. Conventional-word Count
on Task III correlates .35 with Predictive Accuracy
for the total sample. When Conventional-word
Count is held constant, the correlation
between De and Predictive Accuracy
drops from -.41 to -.31, a coefficient
significantly different from zero at the [illegible]
level. Conventionality of description, if it is
to be construed as an attenuating factor,
fails to delete the differences in accuracy
between the De extremes.
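Holding a third variable constant in this way corresponds to a first-order partial correlation. The sketch below uses the standard formula; the correlation between De and Conventional-word Count is not reported in the text, so the value used for it is purely hypothetical and is there only to show the direction of the adjustment.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

r_de_accuracy = -0.41    # reported: De with Predictive Accuracy
r_cwc_accuracy = 0.35    # reported: Conventional-word Count with Accuracy
r_de_cwc = -0.40         # hypothetical: not given in the source

print(round(partial_correlation(r_de_accuracy, r_de_cwc, r_cwc_accuracy), 2))
```

Because less conventional marking goes with both higher De and lower accuracy, removing it shrinks the De-accuracy correlation without eliminating it.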

Stereotypy is a score based on the number
of adjectives which an S checked in the same
direction in describing all three peers in her
group. A high score would seem to indicate
either a failure to discriminate or a highly
homogeneous group. High De scorers earn significantly
lower scores. This score may
be another reflection of venturing from popular
responses.

The adjective checklists were scored for use
of favorable and unfavorable items on the
basis of the ratings procured from a group of
judges by Gough (4). Favorable Self-description
refers to the relative emphasis on favorable
adjectives in describing self. Favorable
Expected Peer-description refers to the relative
emphasis on favorable adjectives in Task
III; Favorable Peer-description indicates the
relative representation of favorable adjectives
ascribed to S by her peers.

The results summarized in Table 1 indicate
that while high De scorers described themselves
in less favorable terms and expected to
be described in less favorable terms than did
the low scorers, the descriptions which were
made of them were not less favorable.

A final score, Anticipated Disagreement, is
the sum of adjectives given opposite signs by
an S in I and in III. A high score may be
interpreted as indicative of an expectation that the
group will differ from the S's self-description.
High De scorers expect to be misdescribed or
perhaps misunderstood. This is not the same
phenomenon as describing self in unfavorable
terms and expecting to be described in unfavorable
terms, since this score is based on
adjectives which differ in sign between self-
and expected-description. This apparent psychological
isolation may be related to characteristics
Gough found in high scorers: "feelings
of despondency and alienation, lack of confidence
in self and others" (7, p. 209).


Peer-nomination Findings

Nominations for insightfulness were tallied
for each S and converted into T scores. The
ratings for insight for the Ss in extreme quartiles
on the De distribution showed a difference
significant beyond the .02 level, Ss with
high De scores being rated less insightful.


Content of Adjectives

The following adjectives are associated with
Ss in the extreme quartiles of the De distribution
of the sample. Adjectives discriminating
at the .01 level are listed before the
ellipsis, those at the .05 level after the ellipsis.

Student Nurses
High De

Described self as: Absent-minded, complicated, forgetful,
[illegible], headstrong, indifferent, temperamental
... flirtatious, pleasure-seeking, quarrelsome.

Described by peers as: No adjectives significantly
discriminating.

Expected to be described as: Absent-minded,
changeable, confused, forgetful, indifferent, irritable,
[illegible], restless, temperamental ...


Low De

Described self as: Confident ...

Described by peers as: No adjectives significantly
discriminating.

Expected to be described as: Ambitious, confident ...

Navy Students 
High De 

Described self as: No adjectives significantly dis- 
criminating. 

Described by peers as: Confused, impatient, lazy
... dreamy, immature, interests narrow, irresponsible,
self-centered, tactless, careless, foolish, impulsive.

Expected to be described as: Indifferent . . . con- 
fused, high-strung, immature, impatient, reckless. 


Low De 

Described self as: Conscientious, contented, cooperative,
gentle, mild, practical, praising, peaceable,
pleasant, relaxed, reliable, responsible, steady, tactful,
thoughtful ... clear-thinking, conservative, conventional,
capable, cautious, efficient, dependable, moderate,
jolly, kind, mannerly, mature, opportunistic,
organized, reasonable, resourceful, self-confident,
self-controlled, simple, stable, sympathetic, thorough,
wholesome.

Described by peers as: Conservative, cooperative,
capable, forgiving, generous, helpful, thoughtful, wise
... alert, clear-thinking, clever, calm, cautious,
easy-going, foresighted, frank, initiative, moderate, modest,
obliging, kind, mannerly, mature, practical, patient,
peaceable, poised, progressive, realistic, reasonable,
relaxed, responsible, serious, steady, trusting.

Expected to be described as: Clear-thinking, capable,
dependable, patient, peaceable, reliable ...
conservative, cooperative, cautious, honest, planful,
reasonable, stable, steady, understanding.

The Navy sample was used in a pilot study,
and some differences between Ss and procedures
prevent direct comparison of the adjectival
content. The Navy sample consisted
of 100 Ss, male and female. They represented
approximately the same age range as the student
nurses (17-21).

The qualities of indifference and
impulsivity appeared in the peer-descriptions
of both samples. In the Navy group, some of
the adjectives which distinguish the expectations
of the high scorers find verification in
the peer-descriptions, e.g., "confused," "immature,"
and "impatient." A correct appraisal
of group opinion was made in regard to some
characteristics at least. Although the adjective
checklist employed in this study has a
[illegible] range of description, there may be other
[illegible] on which the high De scorers
[remainder of sentence illegible].


Summary

The purpose of the study was to test the
hypothesis upon which a scale for the detection
of potentially delinquent behavior had
been based: that role-taking deficiency and 
social insensitivity were characteristic of psy- 
chopaths. More properly, the investigation 
concerned the association of social insensi- 
tivity with high scores on the scale. The scale 
had been demonstrated to possess satisfactory 
screening efficiency, but the operation of the 
assumed variable, it was suggested, had not 
been demonstrated. 

A total of 204 normal female Ss used ad- 
jective checklists to describe themselves, de- 
scribed three other acquaintances specified by 
the investigators, and predicted how they 
themselves would be described in the com- 
posite checklists of the group. A second source 
of data was obtained by using the peer-nomi- 
nation technique for designating low and high 
“insight” Ss. 

Subjects from the extreme quartiles of the 
Delinquency (De) Scale distribution were 
compared for predictive accuracy and other 
variables with the following results and con- 
clusions: 

1. Subjects who score high on De are significantly
less accurate in predicting how they
will be described by others than are Ss who
score low.

2. When peer-nomination ratings of insight
were compared with De scores, high De scorers 
were rated significantly less insightful than 
low scorers. 

3. High De scorers expect to be described 
by peers in unfavorable terms, and so de- 
scribe themselves. This expectation is not sup- 
ported in the descriptions which are actually 
made by the peers. 

4. High De scorers tend to use relatively
fewer adjectives as they are used by the sample
as a whole. Even with conventionality of
description held constant, however, they are
poorer in predictive accuracy than are other
members of the sample.

5. High De scorers seem to expect to be 
misunderstood by their peers. 

6. High De scorers in a pilot study characteristically
expected to be described as confused,
immature, and impatient, and were described
in these terms by their peers.

If incapacity in role-taking implies a relative
inability to understand and predict one's
own social stimulus value in a particular setting,
the findings of this study support indirectly
the theoretical assumption upon which
the De scale is based. The scale itself apparently
discriminates between Ss on their ability
to "see themselves as others see them."


Received January 15, 1957.


Michigan State University


The influence of "defensiveness" and its
opposite, "plus-getting," on objective tests of
personality has been dealt with in a number
of ways. One of these is the devising of scales
measuring the extent of the behavior and its
effect on test scores. Several methods have
been used to this end. Hartshorne and May
(7), for example, invented items expressing
socially desirable attitudes which presumably
would be owned to by most honest people.
The L scale of the Minnesota Multiphasic
Personality Inventory (MMPI) is a current
example of this procedure.

Another technique involves the item analysis
of responses of normals obtaining normal
scores and responses of deviants with normal
scores. Items discriminating between these
groups are assumed to measure defensiveness.
A part of the K scale of the MMPI was
derived in this manner (8).

A procedure introduced by [name illegible] (9)
requires role playing on the part of subjects.
Cofer, Chance, and Judson (1) compared the
responses of college students taking the MMPI
under normal conditions with those of students
given instructions to lead to "faking bad" or
"faking good." Their analysis yielded a scale
measuring "positive malingering" (defensiveness).
In a variation of this procedure, Edwards
[text illegible] judges who were instructed to
mark the socially desirable response to each of
a set of MMPI items. Those on which all judges
[text illegible] were used as a preliminary scale
measuring social desirability effects. Results
with the instrument are reported by Fordyce
[citation number illegible].

The present method is in some ways a
modification of that proposed by Hartshorne
and May. Its rationale first will be described,
followed by results obtained with an application
to the MMPI. It is hoped that the procedure
will be useful with any personality test
containing a reasonably large number of
true-false items.
Rationale of Item Selection

A number of studies have shown a high
correlation between the social desirability of
attitudes expressed by personality test items
and the probability that the items will be
endorsed when given in an inventory. Edwards
(2), for example, reports a correlation of .87
between endorsement and desirability using a
preliminary true-false version of his Personality
Preference Schedule. Social desirability
of items was rated by judges; probability of
endorsement was determined from relative
frequency of "true" responses when the inventory
was given under the usual
testing conditions. Similar outcomes have been
obtained using items from certain MMPI
scales (6).

These studies and others reveal that normal
subjects tend to endorse socially desirable
and reject undesirable items when taking
personality tests. While it is reasonable
to assume that in many cases the endorsements
and rejections represent the true state
of affairs, an individual lacking these attitudes
can obtain a normal score if his answers to
items are determined by social desirability
rather than personal relevance. Similarly, a
normal individual may obtain a deviant score
by endorsing socially undesirable items, and
rejecting desirable items. If the relationship
between desirability and endorsement can be
manipulated by selection of items, it is possible
to create conditions under which defensiveness,
plus-getting, or both, might be revealed
on validating scales.



If with nondefensive subjects the desir- 
ability of items is negatively correlated with 
endorsement, a scale measuring defensiveness 
can be constructed by keying endorsement of 
desirable and rejection of undesirable items. 
When this key is used with ordinary samples, 
most subjects will obtain low scores, and high 
scores will be indicative of defensiveness. 
Items meeting this requirement unfortunately 
are uncommon. 

To measure plus-getting there should be a
positive correlation between desirability and
probability of endorsement when the latter is
based on responses of nonmalingering subjects.
Endorsement of undesirable and rejection
of desirable items would be keyed in this
case, and high scores will reveal malingering
when the scale is administered to the usual
sample. [Word illegible] scales, however,
may meet this description: high scores may be
due either to malingering or to genuine
deviation. A scale of this kind, therefore,
will be useful only if its items are
nondiagnostic. It is not likely that an
investigator deliberately would include many such
items in his original item pool.

A compromise approach is more practical. A 
scale in which desirability and endorsement 
are unrelated, when responses are honest, can 
measure both defensiveness and plus-getting. 
If endorsement of desirable and rejection of 
undesirable items are keyed, defensive indi- 
viduals will obtain high scores and plus-get- 
ting subjects low scores. Intermediate scores 
presumably reflect honest (personally rele- 
vant) answers. Since the correlations so far 
reported between endorsement and desirability 
although large are not perfect, it is possible, 
in principle, to locate items, clearly desirable 
or undesirable, constituting a scale in which 
this correlation is minimized. Even with Ed- 
wards’ correlation of .87, inspection of his 
scattergram (2) reveals that a short defen- 
siveness scale might be derived from his items. 

The nondefensive, non-plus-getting group 
discussed above is hypothetical rather than 
real. It is likely that in most samples there 
will be some defensive and a few plus-getting 
individuals. A scale derived from an honest 
group will show a positive correlation be- 
tween desirability and endorsement of items 
when used with the typical sample. The more
defensive the subjects, the higher this correlation
will be. If it is assumed, however, that
such individuals are in the minority, then the
correlation will be small. When, in practice,
endorsement values are computed from responses
of a mixed group of presumably honest,
defensive, and plus-getting subjects, the
items may be chosen so as to permit a small
positive correlation between desirability and
endorsement.

The assumption that honest subjects are in
the majority leads to a deduction regarding
the internal consistency of the scale. In the
hypothetical honest sample the scale ought to
have zero reliability, i.e., the items should not
correlate with each other. In a sample of defensive
and plus-getting subjects, on the other
hand, the internal consistency of the scale
should be large; the items ought to have high
intercorrelations. In a mixed group with the
majority of subjects honest, the internal consistency
of the test will be smaller than is
usually required for useful measurement. An
example may clarify this point. Suppose that
the scale has a Kuder-Richardson formula 20
reliability of .80 when given to a mixed sample
of 200 defensive and plus-getting subjects.
Suppose, further, that when the same scale is
administered to 800 honest subjects, the K-R
20 reliability is zero. If these 1,000 subjects
are pooled and the reliability recomputed, it
will be found to be approximately .371, provided
the proportion endorsing each item is
the same in both subgroups. In this example,
a potentially highly reliable validating instrument
will have low reliability when most of
the subjects are answering on the basis of personal
relevance rather than social desirability,
honest answers contributing only error to the
validating scale variance. When such a scale
used with ordinary samples shows high internal
consistency, it might be surmised either
that a large proportion of the subjects are not
answering honestly, or that the scale reliably
measures something in addition to defensiveness
and plus-getting.
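The pooling example can be checked with a short calculation. The sketch below is an illustration under stated assumptions rather than the author's computation: it uses the K-R 20 formula, treats the pooled total-score variance as the subject-weighted average of the subgroup variances (which follows if every item has the same proportion endorsing in both subgroups), and takes the number of items as a free parameter because the example does not state it. The function name is hypothetical.

```python
def pooled_kr20(n_items, groups):
    """K-R 20 of pooled subgroups that share identical item p-values.

    groups: list of (n_subjects, kr20) pairs. Variances are expressed
    in units of sum(p*q), which is the same in every subgroup.
    """
    shrink = (n_items - 1) / n_items
    total_n = sum(n for n, _ in groups)
    # From K-R 20: variance / sum(pq) = 1 / (1 - r * (k - 1) / k).
    pooled_variance = sum(n / (1 - r * shrink) for n, r in groups) / total_n
    return (1 - 1 / pooled_variance) / shrink

groups = [(200, 0.80), (800, 0.0)]
for k in (9, 26):            # item counts are assumptions
    print(k, round(pooled_kr20(k, groups), 3))
```

Under these assumptions a nine-item scale gives a pooled value of about .37, in line with the figure quoted in the text, while longer scales give somewhat higher values; either way the pooled reliability falls well below the .80 of the responding subgroup.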
An Application to the MMPI


Derivation of the Scale


Choice of items will be best when judgments
of social desirability are available for
each item in an inventory. In applying the
method to the MMPI, the length of the inventory
presents a formidable rating problem.
For this reason, items presented to judges for
desirability ratings were selected in such a
manner as to reduce the correlation between
endorsement and desirability. One way to
minimize the correlation between two variables
is to reduce either or both variances.
Since variation in desirability is necessary,
neutral items not being scorable for defensiveness,
restriction of variance in endorsement
is the feasible alternative. The items
rated by the judges, therefore, were those endorsed
by 36 to 64 per cent of Hathaway's
normative group of college males and females.1
There were 53 items of this kind. These, together
with ten marker items previously rated
by another group of judges, were rated on a
nine-point scale of social desirability. Judges
were [number illegible] male and 53 female students
in introductory psychology classes at Michigan
State University. Instructions were those used
in an earlier study (6). The social desirability
of an item was found by computing its median
on the rating scale.
tan e to those found in the earlier study. 
Undesirable of four or less were taken as 
Steatey desirable, Twen neutral, and six OF 
fell eit as desirable. Twenty-four of the items 
= Bories into the desirable or undesirable 
ed ur, To these were added two items 
y undesirable but endorsed by the ma- 
26-ite of subjects in the earlier study. The 
m scale thus formed was keye for 


aBre 

desirat eat with desirable and rejection of un- 

M le items. The items keye “true” are 
160, 228, 248, 


264, en Booklet Nos.) 79, 111, 160; ; 
ate; ty 461, and 468. Those keyed “false 
324, 3390 71, 109, 124, 135, 142, 148, 170, 
lnteresa 406, 408, 409, 416, 439, aP . 
the x 'N8ly enough, 12 of these appear on 
on q, Scale, but the scoring of three of them 
S Xperimental scale is opposite t that 

K. Three also are on the L scale of 


quite 


ty and 


S r f i “ye. 
lationship between desirabili 
menta! 


m . 
ent for the 26-item experi 


because NO 
s who 
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scale was determined by correlating desirabil- 
ity ratings with endorsement values computed 
from Hathaway’s normative college data. The 
product-moment coefficient is .27. When en- 
dorsement based on responses of 59 male and 
41 female introductory psychology students to 
the booklet form of the MMPI is substituted 
for the normative group values, the correla- 
tion is .31.2 Thus, the scale appears suitable 
as a measure of defensiveness and plus- 


getting. 
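The selection and keying rules described in this section can be sketched in a few lines. This is a simplified illustration, not the author's procedure: the item numbers, ratings, and function names are hypothetical, and medians falling between the stated cut-offs (for example 4.5) are treated as neutral, which the text does not specify.

```python
from statistics import median

def classify(ratings):
    """Classify an item from judges' ratings on the nine-point scale."""
    med = median(ratings)
    if med <= 4:
        return "undesirable"
    if med >= 6:
        return "desirable"
    return "neutral"          # assumption: in-between medians are neutral

def build_key(endorsement, ratings, low=0.36, high=0.64):
    """Key mid-endorsement items: 'true' if desirable, 'false' if not."""
    key = {}
    for item, p_true in endorsement.items():
        if not (low <= p_true <= high):
            continue          # restrict variance in endorsement
        label = classify(ratings[item])
        if label == "desirable":
            key[item] = True
        elif label == "undesirable":
            key[item] = False
    return key

def score(responses, key):
    """Count keyed (socially desirable) answers."""
    return sum(responses.get(item) == answer for item, answer in key.items())

# Hypothetical items and ratings, for illustration only.
endorsement = {1: 0.50, 2: 0.40, 3: 0.90, 4: 0.55}
ratings = {1: [7, 8, 6], 2: [2, 3, 4], 3: [8, 8, 9], 4: [5, 5, 5]}
key = build_key(endorsement, ratings)
print(key, score({1: True, 2: True, 3: True, 4: False}, key))
```

Item 3 is dropped because nearly everyone endorses it, and item 4 because it is neutral, so only items whose endorsement is near 50 per cent and whose desirability is clear end up in the key.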


Reliability

The internal consistency of the experimental
scale was computed using K-R 20 and
protocols of the same 59 male and 41 female
students. The reliability coefficient is .48. A
similar coefficient was computed for K and
found to be .73. The moderate internal consistency
of the experimental scale, however,
is in line with the assumption that honest responses
predominate in typical groups. If the
assumption is correct, the higher reliability of
K is due to the operation of some additional
variable or variables not measured by the experimental
scale.3


Validity

Only indirect evidence of the validity of the
experimental scale was obtained. Since the K
scale of the MMPI has been the subject of
much study in this regard, with some evidence
pointing to its validity, it was decided to compare
the experimental scale and K with respect
to their relationship to the common
diagnostic scales of the inventory and to other
measures of defensiveness. Table 1 presents
the correlations among L, K, Edwards' recent
revision of his scale (SD), the Cofer, Chance,
and Judson scale (Cof), and the experimental
scale (Ex). All correlations are positive, and,
with the exception of the Cofer scale, high. It
would be difficult to justify a choice between
K, SD, and Ex on these data alone.


Table 1
Raw Score Correlations Among Measures of
Defensiveness on the MMPI (a)

Scale    L      K      Ex     SD     Cof
L        -      .40    .57    .35    [illeg.]
K               -      .64    .73    .39
Ex                     -      .69    .43
SD                            -      [illeg.]

a N = 100; an r of .20 is significant at the .05 level.


Table 2
Raw Score Correlations of the Experimental Scale (Ex) and K with MMPI Scales (a)

Scale    F      Hs     D      Hy     Pd     Pa     Pt       Sc       Ma
Ex       -.37   -.44   -.45   -.00   -.40   -.09   -.65     -.56     -.35
K        -.36   -.34   -.28   .15    -.24   .07    -.6[?]   -.5[?]   -.40

a N = 100; an r of .20 is significant at the .05 level.


2 The writer is indebted to Dr. Frederick C. [surname illegible]
for making available these MMPI protocols
collected in another investigation.
3 A variable [text illegible] the response set
"acquiescence" [text illegible].

If a validating scale is to improve prediction,
it is necessary that it correlate with the
predicting scales. Correlations between validators
and predictors, as the latter now are
keyed, may be expected to be negative, i.e.,
defensiveness will lower scores on predicting
scales. The experimental scale and K are compared
in this respect in Table 2. In Table 3
may be found correlations with the "Obvious"
and "Subtle" keys Wiener (11) has developed
for several of the MMPI diagnostic
scales. (The coefficients in Table 3 generally
are higher than the corresponding values in
Table 2, a function of the mixed nature of
many MMPI diagnostic scales.) Again, it is
clear that K and the experimental scale give
similar results. In most cases the larger differences
in Table 2 may be ascribed to differences
in item overlap. The extent of this overlap
and the manner in which it is scored are
shown in Table 4, similar material on the Obvious
and Subtle keys being omitted to save
space.

It would be useful if these correlations
could be corrected for item overlap. Wheeler,
Little, and Lehner (10) have prepared a table
based on the independent-elements formula
for the Pearson coefficient that indicates the
correlation to be expected because of item
overlap between various MMPI scales, although
they reject its application to the present
kind of problem for psychological reasons.
Actually, the elements in their situation are
test items, and these are not independent when
scales have any internal consistency. To correct
the present coefficients properly for item
overlap would require information regarding
the pattern of intercorrelations among all the
items on the scales, and these data would be
tedious indeed to extract. A direct method is
necessary to eliminate the effects of item overlap.
Results from this procedure will be presented
in a later section.

A shortcoming in the foregoing presentation
is the absence of material regarding comparative
scores of deviant and normal groups. It
may be speculated, however, that the less
overlap there is between a diagnostic scale
and a validating scale, the less likely it will be
that the validating scale itself is diagnostic.
In the present limited application, it was possible
to reduce the amount of overlap compared
to that occurring in the K scale, as can
be seen in Table 4.


Table 3
Raw Score Correlations of the Experimental Scale (Ex) and K with the Obvious and
Subtle Keys of the MMPI (a)

             Obvious Key
Scale    D          Hy     Pd         Pa     Ma
Ex       [illeg.]   -.39   [illeg.]   -.43   -.50
K        -.50       -.55   -.60       -.49   -.59

Subtle Key: the entries for Ex are illegible; the legible entries for K
are .44, .69, and .34, with their column positions uncertain.

a N = 100; an r of .20 is significant at the .05 level.


Table 4
Item Overlap of the Experimental Scale (Ex)
and K with MMPI Scales

                      MMPI Scales
Scale   Scoring     F    Hs   D    Hy   Pd   Pa   Pt   Sc   Ma
Ex      Same        0    0    1    5    1    2    0    0    2
        Opposite    0    0    4    [?]  2    2    2    [?]  [?]
K       Same        6    0    6    0    7    2    0    2    4
        Opposite    1    0    2    0    1    0    2    0    1


The Effect of Acquiescence 


Fricke (5) has shown that his measure of
acquiescence, the set to respond systematically
with "true" or with "false," correlates highly
with K and attributed this correlation to the
operation of the response set. He further suggests
that correlations between K and other
MMPI scales arise from this set rather than
from defensiveness. According to Fricke, correction
for acquiescence becomes important
when the discrepancy between percentage true
and percentage false in a key exceeds 40 per cent.
This is the case with K and Edwards' SD
but not with the experimental scale. The correlation
of .64 between K and the experimental
scale could have occurred because of
defensiveness and plus-getting, and the correlations
of similar magnitude Fricke reports
between his measure, Set T, and K could have
arisen because of acquiescence. It is possible,
however, that the experimental scale was influenced
to some extent by acquiescence, especially
when it is noted that the items were
chosen from those whose probabilities of endorsement
were close to .50. Items of this
kind are held to be especially susceptible to
the response set.

To settle this problem, recourse was made
to empirical evidence. The true-false difference
in percentage of keyed responses was set
at zero by eliminating eight of the experimental
items keyed false (Nos. 15, 135, 142, 148,
324, 383, 409, and 444). Remaining were 18
items, nine true and nine false. These were
keyed as before for a measure of defensiveness
(Sx). The same 18 items were rescored, true
responses being keyed. Scored in this manner,
the items yielded an "all true" score (AT), a
measure of acquiescence.
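Scoring the same balanced item set two ways can be illustrated as below. The item labels and response patterns are hypothetical; only the logic (Sx counts answers that match the desirability key, AT counts "true" answers on the same items) follows the description above.

```python
def score_sx(responses, key):
    """Defensiveness key: count answers matching the desirability key."""
    return sum(responses[item] == keyed for item, keyed in key.items())

def score_at(responses, key):
    """'All true' key: count 'true' answers on the same items."""
    return sum(bool(responses[item]) for item in key)

# Hypothetical balanced key: half the items keyed true, half keyed false.
key = {"a": True, "b": True, "c": False, "d": False}

acquiescent = {item: True for item in key}   # answers "true" to everything
defensive = dict(key)                        # always gives the keyed answer

print(score_sx(acquiescent, key), score_at(acquiescent, key))
print(score_sx(defensive, key), score_at(defensive, key))
```

With a balanced key, a purely acquiescent responder lands at the midpoint of Sx while reaching the maximum on AT, and a purely defensive responder does the reverse, which is why the two scorings can be separated on the same items.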
The correlation between Sx and AT in the
sample of 59 males and 41 females is -.11. Thus, as
far as the scores represent the effects of individual
differences in defensiveness and in acquiescence,
the two appear independent. The
K-R 20 reliability of Sx is .31, while that of
AT is .32. In spite of the small number of
items, both ways of scoring give measures with
internal consistency.

By correlating both scales with MMPI
diagnostic and validating scales, it was possible
to arrive at an idea of the effects of social
desirability and acquiescence. The correlations
with the diagnostic scales are shown
in Table 5. These coefficients are free from
item overlap; the common items have been
deleted from the diagnostic scales. If the two
scorings of the experimental scale are accepted
as valid, both acquiescence and social desirability
affect scores on most of the MMPI
scales. This is especially clear when one considers
the Obvious and Subtle keys and those
diagnostic scales for which no such keys have
been developed (because of lack of subtle
items).


Table 5
Correlations of the Experimental (Sx) and AT Keys with MMPI Scales

[Table entries illegible.]

Ten of the 18 coefficients in Table 5 in- 
volving the AT scale are statistically signifi- 
cant. Fricke (5) suggests that a true-false 
discrepancy of 40 per cent or more is a cri- 
terion of susceptibility to acquiescence. In 
ten of the 18 correlations, the true-false dis- 
crepancy of the MMPI scale exceeds this 
value. Nine of the ten significant coefficients 
occur with these susceptible scales, and in all 
instances the signs of the coefficients are in 
the proper direction. Thus, for example, the 
Pt scale has many more true than false items: 
highly acquiescent subjects would be expected 
to receive high Pt scores, and the empirical
correlation of .26 between Pt and AT sup- 
ports this prediction. 

While acquiescence plays its role, social desirability
is at least as important, if not more
so, in the present case. All Obvious scales
show negative correlations with Sx. Its correlations
with the Subtle scales are smaller
and positive, suggesting that in some cases
subtle items are not necessarily free from social
desirability.4

The K scale was explored in similar fashion.
The elimination of item overlap between K
and the 18 items required pruning eight items
from K, but in spite of this drastic operation,
correlations remain with the two keys. The
correlation between K and Sx is .48, while
that between K and AT is -.39, both significant
at the .01 level. Edwards' revised
scale also was examined in this manner, the
single overlapping item being dropped from
SD. The correlation between Sx and SD is
.55, between AT and SD, -.24. The better
balance of true and false items on SD, as
compared to K, appears to reduce the effects
of acquiescence.

4 An item may be subtle on one scale but obvious
on another. This is not true with respect to its social
desirability.


Discussion 


The results of this study demonstrate the
feasibility of constructing a measure of
defensiveness and plus-getting that involves rather
less time and expense than usually is required
with other methods of item selection. It is
not proposed in this paper, however, to offer
the experimental scale as a substitute for any
MMPI validating variable, nor is it suggested
that the validating arsenal of the psychologist
has been increased. The MMPI study was
undertaken primarily to verify a line of
reasoning. The method described appears to work
when applied to the MMPI, and it might be
expected to be useful when employed with
similar kinds of inventories.

A by-product of the study, nevertheless,
sheds light on the nature of K and other
MMPI scales. Both acquiescence and social
desirability play their parts, hence some
qualification of Fricke’s (5) results should be
offered. While Fricke selected his items so as
to measure acquiescence, it is not known
if these items are neutral in social desirability
or whether defensiveness and plus-getting
affect scores on his measure.

In the present study, the scoring of one
scale is such that individuals responding
defensively obtain average scores, while the alternate
scoring gives average scores to the highly
acquiescent. This is possible because the average
item on the experimental scale is endorsed by
nearly 50 per cent of the sample. The AT and
Sx scales correlate negligibly; thus measured
acquiescence is independent of measured
defensiveness. It might be that acquiescence and
defensiveness, as measured by ideal
instruments, would correlate significantly; there
might, for example, be a tendency for
acquiescent individuals to be defensive as well.
The fact that both scales correlate with other
variables in the proper direction, however,
indicates that the proposed method is sound.
The investigator who develops a measure of
defensiveness by applying the present
procedure could utilize the same items in a
measure of acquiescence. Both keys then could be
used to obtain information regarding the
susceptibility of his other scales to these
diagnostically extraneous variables. In addition,
relationships to outside criteria may reveal, as
has been suggested, that the validating
scales actually tap personality dimensions
rather than temporary response sets.


Summary 


This study was devoted to testing the
usefulness of a method for selection of items
measuring test-taking defensiveness and
plus-getting. It is proposed that a scale in which
the social desirability of items is uncorrelated
with their probability of endorsement can
serve as a validating measure when the items
are keyed in terms of the socially desirable
response. Application to the MMPI
demonstrated that a 26-item experimental scale is
similar to the K scale with respect to
correlations with the main diagnostic scales of the
MMPI. Significant correlations were found
with other measures of defensiveness.
When the experimental scale was corrected
for acquiescence response set, it was found
that both acquiescence and defensiveness could
be measured by appropriate keying of the
same set of items. Correlations of these two
keys with other MMPI scales indicated that
both defensiveness and acquiescence contribute
to the variance on the diagnostic measures.


Received December 18, 1956.
Extraversion, Neuroticism, and Manifest Anxiety¹


A. W. Bendig 


University of Pittsburgh 


One of the most recently developed tem- 
perament inventories is the Maudsley Per- 
sonality Inventory (1, 2). This inventory 
contains 48 trichotomously scored items 
(“Yes-?-No”) with 24 items scored for each of the
two scales included in the MPI: Extraversion 
and Neuroticism. Eysenck reported (2) that
the two scales had split-half reliabilities of .77
and .88 and an intercorrelation of -.05 (N =
400) for his normative group and an
intercorrelation of .12 for another smaller group (N =
50). Eysenck has hypothesized (1, p. 50) that
Taylor’s Manifest Anxiety Scale (4) should 
show a strong positive correlation with the 
MPI Neuroticism scale and a smaller nega- 
tive correlation with the Extraversion scale. 
The present research was designed to inde- 
pendently assess the internal consistency reli- 
ability of the MAS and MPI scales, and also 
to provide evidence as to the interrelation- 
ships among these three scales. 

The MAS and MPI were administered to 
college Ss enrolled in a one-semester introduc- 
tory psychology course. The sample for the 
reliability study consisted of 145 Ss (100 men 
and 45 women), while the sample used in 
finding scale intercorrelations included 254 Ss 
(210 men and 44 women). The sex groups 
were so similar in scale means, variances,
reliabilities, and intercorrelations that only
results from the combined sex groups are
reported here.


1 An extended report of this study may be
obtained without charge from A. W. Bendig, Dept. of
Psychology, University of Pittsburgh, Pittsburgh 13,
Pa., or for a fee from the American Documentation
Institute, Order Document No. 5313, remitting $1.25
for microfilm or $1.25 for photocopies.

Reliability of the MAS scale was estimated
by Kuder-Richardson Formula 20, while a
variation of the same formula for use with
trichotomous items (3) was used with the
MPI scales. The reliability coefficients for the
Anxiety, Extraversion, and Neuroticism scales
were .78, .74, and .84 (N = 145). The
intercorrelations among the scales were:
Anxiety-Extraversion, -.35; Anxiety-Neuroticism,
.77; and Extraversion-Neuroticism, -.20. All
three of these coefficients are significant at
the .01 level (N = 254).
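
As an illustration of the reliability estimate named above, the following minimal Python sketch computes Kuder-Richardson Formula 20 for dichotomous (0/1) items. The function name `kr20`, the toy response matrix, and the use of the population variance of total scores are assumptions made for this example only; the variation for trichotomous items (3) that was applied to the MPI scales is not reproduced here.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson Formula 20 for dichotomous items.

    responses: list of subjects, each a list of 0/1 item scores
    (hypothetical layout chosen for this sketch).
    """
    n_subjects = len(responses)
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("KR-20 needs at least two items")

    # Variance of the subjects' total scores (population form, an assumption).
    totals = [sum(subject) for subject in responses]
    total_variance = pvariance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    # Sum over items of p * q, where p is the proportion scoring 1.
    pq_sum = 0.0
    for item in range(n_items):
        p = sum(subject[item] for subject in responses) / n_subjects
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / total_variance)


if __name__ == "__main__":
    toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]  # invented data
    print(kr20(toy))
```

The toy data carry no empirical meaning; they only show how item difficulties and total-score variance enter the coefficient.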
The reliabilities of the MPI scales are very
similar to those reported by Eysenck (2) and
are about the same size as the MAS reliability,
although the MPI scales contain only
one-half of the number of items on the MAS. The
scale intercorrelations also confirm Eysenck
(1), although the intercorrelation between the
two MPI scales is slightly higher than
he reports.



Brief Report. 
Received May 6, 1957. 


References


1. Eysenck, H. J. A dynamic theory of anxiety and
hysteria. J. ment. Sci., 1955, 101, 28-51.
2. Eysenck, H. J. Reminiscence, drive, and
personality theory. J. abnorm. soc. Psychol., 1956,
53, 328-333.
3. Ferguson, G. A. A note on the Kuder-Richardson
formula. Educ. psychol. Measmt, 1951, 11,
612-615.
4. Taylor, Janet A. A personality scale of manifest
anxiety. J. abnorm. soc. Psychol., 1953, 48,
285-290.



Journal of Consulting Psychology
Vol. 21


F-K in a Motivated Group


James Drasgow and W. Leslie Barnette, Jr.
University of Buffalo


Falsification of response has long been
considered an important factor which limits
the validity of the personality inventory
or questionnaire (1, p. 491), and as Gough
has stated: “A recurring problem in the use
of any personality test is the question of
dissembling” (3, p. 408). The Minnesota
Multiphasic Personality Inventory (MMPI)
contains at least four separate scales (?, L, K, F)
which directly contribute to evaluating the
validity of the scores in the profile. Each one
of these four scales assesses validity from a
somewhat different approach, but apparently
the most promising combination of scales to
date is Gough’s F minus K (3).

Several empirical studies have reported on
the usefulness of the values obtained by
subtracting the K raw score from the F raw
score (1, 2, 3, 4, 5). These studies have been
especially concerned with establishing
optimal cutting points in the F - K distributions.
The distributions that have been studied
were obtained from subjects under conditions
such as: (a) “normal” subjects who were given
no particular instructions to affect F - K,
(b) normals who were instructed to feign
abnormality, and (c) normals who were
instructed to act [...] “normal” [...]. The
most rewarding results have been [...]
condition (5). These studies [...] well
replicated and cutoff scores stated. The
present study focuses on the unsolved reverse
problem of detecting profiles which have been
faked to make a “good impression.”

In an attempt to discover what precise range
of F - K values might be used to discriminate
subjects who were instructed to fake a
“good” normal profile, Hunt (4) concluded
that more research was needed. Cofer (1)
also was unable to find discriminating F - K
values; Gough (3), re-working Cofer’s data,
reiterated Cofer’s conclusion.

Why is it that the faked-good profiles have
been so difficult to detect? In previous studies
with subjects working under fake-good
instructions, their motivation is open to
question. The choice of motivated subjects by
previous researchers appears to have been
unfortunate since the probability of finding real
differences has been minimized by supplying
only instructions to stimulate motivation. The
present brief report has therefore focused on
this aspect and supplies a group with higher
motivation.


Subjects 


The University’s Vocational Counseling Cen- 
tre provides a job applicant screening service 
to business and industry. The job applicants 
from this service formed the group with which 
we worked. Some of the applicants were 
applying for jobs with companies without 
having been previously associated with the 
company, while others were old-line company 
employees competing for promotion to a 
choice spot. All Ss were employed males on 
their “old” jobs at the time of testing. The 
jobs for which they were being tested in- 
cluded such titles as foreman, salesman, su- 
pervisor, superintendent, and vice-president. 
The jobs can be seen in a framework of
advancement and betterment so that in this
society we can reasonably infer an appreciable
degree of motivation to get the better job.
[...] that they had been [...] job in
question and it was no [...].

The total number of “industrial cases” with
MMPIs available for use in the present study
was 92. Within this total pool, 66 profiles had 
scores within the normal range (T = 30 to 
70), and 26 profiles had one or more scores 
outside. The normal sample of 66 cases is uti- 
lized here, the remaining 26 being held for 
separate study pending accumulation of more 
cases. 

The mean age of the “normal” job applicant
was 34 ± 7 with a range from 20 to 55. Seventy
per cent of the group was married; the
families averaged two children with a range
from zero to four. The mean years of education
was 14 ± 2 with a range from 8 to 17.
Fifty-seven per cent of the sample had at- 
tended college; 29% were college graduates. 

The modal person in the sample was a 34- 
year-old white married male with two children 
and two years of college. He was currently 
employed, but trying to get a “better” job. 


Results and Discussion 


All scores on all MMPI scales on the pro- 
file were within the accepted normal range as 
stated earlier. The mean raw F was 1.6 and 
the SD was 1.5; the mean raw K was 17.6 
with SD 3.2. The difference of — 16 for F — 
K is well beyond the .01 level. Table 1 pre- 
sents the distribution for this normal sample. 

Hunt (4) reported a mean F - K of -11
for the group of navy prisoners who were
asked to make a good impression, but he was
dissatisfied with this statistic because too
many normals also gave this value. One might
then expect an F - K of this size as an
indication of a “normal” amount of hypocrisy
which may be associated with making a good
impression in this society. Gough (3) gives
-7 to -10 as a modal range within which
the majority of normals would fall.


Table 1
F-K Distribution from “Normal” MMPIs

F-K          N
 -8,  -9     2
-10, -11     4
-12, -13    10
-14, -15    13
-16, -17    14
-18, -19    10
-20, -21    10
-22, -23     3
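
As a rough check on the figures reported above, the following Python sketch recomputes the F - K index from the two reported raw-score means and from the grouped frequencies of Table 1. It is an illustration only: the first and last class intervals are garbled in the source and are read here as -8, -9 and -22, -23, each interval is represented by its midpoint, and the names `TABLE_1` and `grouped_mean` are hypothetical.

```python
# Reported raw-score means for the 66 "normal" profiles.
MEAN_F = 1.6
MEAN_K = 17.6

# Table 1 class intervals (F - K) and frequencies, as read from the source.
# The first and last intervals are assumptions (garbled in the scan).
TABLE_1 = {
    (-8, -9): 2,
    (-10, -11): 4,
    (-12, -13): 10,
    (-14, -15): 13,
    (-16, -17): 14,
    (-18, -19): 10,
    (-20, -21): 10,
    (-22, -23): 3,
}


def grouped_mean(table):
    """Return (mean, n) using the midpoint of each class interval."""
    n = sum(table.values())
    weighted = sum(((a + b) / 2) * count for (a, b), count in table.items())
    return weighted / n, n


if __name__ == "__main__":
    mean_fk, n = grouped_mean(TABLE_1)
    print("difference of means:", round(MEAN_F - MEAN_K, 1))
    print("grouped mean F - K:", round(mean_fk, 2), "from", n, "profiles")
```

Under these assumptions the frequencies sum to the 66 profiles described in the text, and the grouped mean lies close to the reported value of -16.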

The F - K of -16 found among the job
applicants is indirectly corroborated by an
earlier study of MacLean et al. (5) with
nurse candidates in a selection situation.
Scores greater than -17 were reported as the
probable fake-goods. However, their sample
was entirely female and they cautioned against
generalizing to dissimilar groups. Judging from
the similarity of the mean F - K found in
their study and in the present one, the
difference between male and female is quite less
than obviously exists in other areas.

A further corroborating factor and potential
source of explanation for the obtained results
appeared in the relationship between an
applicant’s F - K and his number of
dependents. Because of the restriction in the range
of the number of dependents and the
non-normal nature of the distribution, a
nonparametric correlation technique was used to
estimate the association (cf. 6, pp. 202-210).
The correlation was .61 and significant [...].
It is of interest to report that the correlation
between F - K and age was zero, while that between
F - K and education was -.18 (Pearsonian
r’s in both instances).
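
The text does not name the nonparametric technique that was used. As one plausible illustration, the Python sketch below computes a Spearman rank correlation with average ranks for ties, which suits a variable such as number of dependents that takes only a few distinct values. The choice of Spearman's coefficient, the function names, and any data passed to them are assumptions for this example and are not taken from the study.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson r computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A call such as `spearman(dependents, f_minus_k)` on two equal-length lists (hypothetical names) returns a coefficient between -1 and +1.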

The relatively high relationship indicated
by the .61 could probably be interpreted in a
variety of ways. The writers would relate it
to the American middle class value of upward
social mobility. We postulate that the more
dependents the client has, the greater will be
his felt personal responsibility and that, as a
partial consequence, the more motivated he
will be to make a good impression so as to
secure the proposed upgrading on the job. It
is unfortunate that this statistical test cannot
be made from the data of MacLean et al.


Summary


Other MMPI studies involving F - K in
samples where testees have been requested to
fake good are criticized on the grounds of
inadequate motivation or felt responsibility.
Results are presented, utilizing 66 normal MMPI
profiles obtained from clients tested for
upgrading where evidence was available for
motivation. The mean F - K index
was -16. Age and years of education
had little or no effect; number of dependents,
however, was significantly related to this
index. It is proposed that the felt responsibility
and the [...] are
factors in producing such elevated
F - K indices.
The Edwards Personal Preference Schedule
and Social Desirability 


Robert E. Silverman 
New York University 


One of the major sources of error in per- 
sonality testing is the test-taking attitude of 
the subject. This variable is most intrusive in 
tests which measure independent personality 
factors. If the subject is resistant against re- 
vealing any psychological weaknesses, this re- 
sistance will confound the entire scale. One 
of the features of the recently developed Ed- 
wards Personal Preference Schedule (PPS) 
(1) is the attempt to control test-taking atti- 
tude by employing a forced-choice technique 
in which persons are asked to choose between 
items matched for social desirability value. 
Edwards (1) reported that this procedure 
was effective in minimizing the effects of so- 
cial desirability. He correlated the 15 test 
variables with two measures of social desir- 
ability, the K scale of the Minnesota Multi- 
phasic Personality Inventory (MMPI) (2), 
and a specially constructed social desirability 
scale based on items from the MMPI. In gen- 
eral, the correlations were low, indicating that 
social desirability was not a major factor in 
the PPS scores. 

Because of the importance of the social de- 
sirability variables, it was thought advisable 
to repeat the Edwards study using the K scale 
and a second independent measure of social 
desirability. This latter measure was based on 
the differences between scores on the Taylor 
Manifest Anxiety Scale (MAS) (4) and a 
forced-choice version of the MAS designed by 
Heineman (FC) (3). It was reasoned that a
defensive test-taking attitude would manifest
itself in terms of the differences between scores
on the transparent “Yes-No” inventory and
the FC inventory in which precautions had
been taken to minimize the effects of social
desirability.


Subjects and Procedure 


The total sample consisted of 147 male
students enrolled in the general psychology course
in the University College of Arts and Sciences
of New York University. All tests were
administered in the weekly discussion sections in
groups of 20 to 25 Ss. The Ss were told that
the tests were a part of a research project
and that their cooperation was desired. In
addition, they were informed that they could
discuss their test findings with a member of
the psychology staff.

During the first testing session, an
inventory consisting of the MAS and the K scale
was administered to the 147 Ss. Two [...]
later, 116 Ss were given the PPS, and [...]
weeks from the first test, 98 Ss who had taken
both tests were given the FC scale. The
attrition of Ss was due to class absences.


Results and Discussion

The correlations between the PPS variables,
the measures of social desirability, and the
anxiety scales for the present study and for
the Edwards study are presented in Table 1.
The relationships in the two studies are
similar. The correlations between the personality
variables and the K scale are all low. In the
present study, four K-scale correlations are
significant at the .05 level or better. Two of
these, Autonomy and Aggression, were also
significant in the Edwards study. In order to
compare the two sets of K-scale correlations,
they were treated as [...] scores and
assigned ranks. The highest positive correlation
was given the rank one and the highest
negative correlation the rank 15. The rank order
relationship between the findings of the two
studies was +.65, significant at [...].
Table 1
Correlations Between the PPS Variables, and Measures of Social Desirability and Anxiety

                  Present study                                  Edwards study
PPS Variable      K scale   Difference   MAS   Forced choice     K   SD   MAS

2 Achievement 08 _ 95% 

Be Oner 33** o FO “0 fae as 
4. Exhibie 23" -2  — 14 ‘16 ae 

S- Autonomy —.10 os 25" Lias 08 er 
6. Afiiiation a 7 Se -an ae -o 

i Intraception E 2 o 08 —.02 109 

s Sucr . 07 AS 16 .06 - 

Sh Soper eae —.10 gt 23 e oe 
1O! basement ‘06 =23 20 os -0 0 
11. Nurt ment 01 12 35 ‘08 =.14 1 
12. qurturance ‘05 18 31 ‘08 = —.08 a7 
13, ae — 08 au A? 05 03 —.07 
14, podurance 18 —9ge —05 24t ee 

ly 2105 .06 —.01 —.15 07 03 
ssion a "35 yy ut 33¢*  —.10 ‘00 


Note: In the present study, the correlation
between the K Scale and the [...] was [...];
the correlation between K and Social Desirability was .63**.
* .05 level of confidence.
** .01 level of confidence.

The correlations between the personality
variables and the Difference Scale are also
low with five correlations being statistically
significant. Autonomy and Aggression are
again among the significant correlations.

The Social Desirability scale developed by Edwards
[...] significant correlations with the
personality variables. One of these, Endurance,
was also significantly correlated with the
Difference Scale in the present study.

The correlations between the personality
variables and the two anxiety scales presented in
Table 1 indicate that in the present study
[...] personality variables are significantly
correlated with the Taylor MAS, [...]
personality variables are significantly correlated
with the FC anxiety scale. Edwards found
[...] significant correlations between the
[...] and the personality variables [...]
and Endurance. In the present [...]
variables are correlated with the [...].
[...] variable, Succorance, is also related to
the FC anxiety scale.

A comparison of the means and SD’s of
the [...] for the [...]
showed five of the 15 variables to be
significantly different from each other. Although in
each case the differences are small, the number
five is greater than one would expect if
only chance were operating (p < .01). The
five variables, Deference, [...], Dominance,
Endurance, and Aggression, suggest the
presence of small but possibly systematic
differences between the two samples. However,
these differences do not seem to affect the
relationships between the PPS and social
desirability and between the PPS and
the anxiety scales in the two studies.

On the basis of the present data and the
findings of Edwards, it would appear that
social desirability does not contribute
appreciably to the PPS scores. Only three of
the 15 PPS variables show somewhat consistent
correlation with the social desirability
indices. Among these, Autonomy, Endurance,
and Aggression, [...] between the K
scale and Aggression.

However, the fact that these variables are
[...] with social desirability indices may
[...] to the construct validity [...]
to relate [...] attitude in which the S
describes himself as a conformist, as
persevering, and as unwilling to manifest hostility.

The correlations between the PPS variables 
and the FC anxiety scale represent some rela- 
tionships between two scales which minimize 
test-taking attitude. The predictive validity 
of neither scale has been established, but the 
correlations do provide us with some informa- 
tion concerning self-report behavior. There 
seems to be a constellation of PPS vari- 
ables that correlates reliably with the FC 
anxiety scale. According to Edwards (1) these 
PPS variables are correlated with each other. 
The constellation includes Achievement, Au- 
tonomy, Dominance, Abasement, and Nur- 
turance. Achievement, Autonomy, and Domi- 
nance are negatively correlated with the FC 
anxiety scale, while Abasement and Nurtur- 
ance are positively correlated. It would seem 
that persons who obtain relatively high scores 
on the FC anxiety scale describe themselves 
as having little achievement orientation, as 
being conformists, as unwilling to be leaders,
as feeling guilty, and as desiring to help others. 
It is noteworthy that these relationships are 
revealed only when social desirability factors 
are minimized. 
Summary 


The test-taking attitude often referred to
as social desirability, and the 15 variables
of the Edwards Personal Preference Schedule
were compared using two measures of social
desirability. The findings support those of
Edwards in that social desirability plays only a
slight role in influencing some of the PPS
scores. Two variables do show a consistent 
relationship with social desirability and this 
was interpreted as possible indication of the 
construct validity of these variables. Correla- 
tions were reported between the two anxiety 
scales and the PPS variables. Some implica- 
tions of these correlations were discussed. 
The Correlates of Manifest Anxiety in Perceptual
Reactivity, Rigidity, and Self Concept¹


Emory L. Cowen, Fred Heilizer, Howard S. Axelrod,
and Sheldon Alexander
In two preceding papers, data have been
presented indicating a lack of relationship
between scores on the Taylor Manifest Anxiety
Scale (29) and performance on two complex
learning tasks: (a) stylus maze learning (2)
and (b) paired associate learning (14). These
findings are somewhat in contrast to earlier
studies in which high anxious subjects (Ss)
have been found to perform more poorly than
low anxious Ss on complex learning tasks
involving multiple competing response
tendencies (13, 17, 23, 25, 30).

The present report deals with the relationship
between A-scale response and performance
on three personality-cognitive measures:
perceptual reactivity to “threat expectancy,”
problem solving rigidity, and self-regarding
attitudes. It evolves from the same basic
design as do its two predecessors (2, 14) and
accordingly, involves two methodological
extensions of earlier studies in the area. Thus,
in order to examine the over-the-range
discriminatory power of the A scale, we have
included a middle anxiety group, consisting of
scorers between the 43rd and 56th
percentiles. And secondly, we have utilized the
L score as an additional criterion in
order to determine whether differences in
performance on the dependent measures are
related to variations in L-scale score.

With respect to the variable of perceptual
reactivity, previous research investigations
(9, 20) report elevated
thresholds for correct report of taboo words
as compared to neutral ones. While these 
studies have been criticized (1, 11) in terms 
of failing to demonstrate perceptual defense, 
the elevation of thresholds of veridical report 
for taboo words under these operationally de- 
fined test conditions remains as verified data. 
The suggestion has been made (1) that while
these studies may demonstrate intraorganismic
defensiveness to noxious stimuli, such
defensiveness may more parsimoniously be viewed
as occurring in processes other than the
perceptual one (e.g., inhibition of verbalization).
Apart from the question of whether or not
such defensiveness is truly perceptual in
nature, the thesis has been presented that
anxiety is the critical variable underlying the
frequently observed elevation of thresholds for
noxious stimuli (11, 21). If this is so, it
might be anticipated that Ss with “built-in”
anxiety would have greater elevation
of thresholds for anxiety producing stimuli
(e.g., taboo words) than would low anxiety Ss.
The present experimental situation offers a
test of this proposition.
With respect to the measure of
problem-solving rigidity, there is available a consistent
body of research which suggests that there is
a relationship between experimentally induced
anxiety and problem-solving rigidity
(6, [...]7). It might therefore be expected that
there would be a comparable relationship
between scale anxiety as measured by the Taylor
MAS and rigidity in problem-solving
situations. Examination of this relationship
constitutes a second aspect of the present research.

A third concern is the relationship
between A-scale anxiety and self-regarding
attitudes. Here, the proposition has been
made that large self-rating discrepancies from
differing frames of reference are correlates of 
a state of internal incongruity or tension 
within the individual (3, 5, 26). Accord- 
ingly, a correspondence between high anxiety 
and large self-rating discrepancies might be 
expected. The third aim of the current in- 
vestigation is that of examining the relation- 
ship between these variables. 

Subsequent to the collection of our data, 
two studies have been reported in the litera- 
ture which bear directly on the present in- 
vestigation with respect to the variables of 
problem-solving rigidity and perceptual re- 
activity to threat expectancy. Maltzman, Fox, 
and Morriset (19) administered the Taylor A 
scale to 48 Ss, who were divided into above 
and below the median scorers, and were given 
the Luchins water-jar rigidity task (18). Sig- 
nificantly more rigid solutions were given by 
HA Ss than by LA Ss. This finding is some- 
what surprising since their Ss were divided at 
the median and thus, presumably, the ma- 
jority had A-scale scores in what has been 
considered to be the nonsensitive middle range. 
An additional datum reported by these in- 
vestigators was that on an extinction prob- 
lem requiring a new and direct solution, there 
was a tendency (p = .16) for the HA group 
to obtain fewer correct solutions in the al- 
lotted time. 

Bitterman and Kniffin (4) have reported a 
study dealing with manifest anxiety and “per- 
ceptual defense.” Extreme HA and LA groups 
were selected on the basis of response to the 
Taylor A scale. Using a tachistoscopic ex- 
posure method, all Ss were shown four neutral 
and four taboo words. Taboo words were 
found to have significantly higher recognition 
thresholds than neutral words. However, there 
were no observed differences with respect to 
the amount of “perceptual defense” (i.e., 
taboo word thresholds minus neutral word 
thresholds) between Ss with high and low 
anxiety. 

Procedure 

Subjects. One hundred and two freshman 
men and women were selected from a sample 
of 276 on the basis of their Taylor MAS 
scores. Three groups were chosen: High
Anxiety (HA): scores of 27 (90th percentile)
and above; Middle Anxiety (MA): scores of
12-14 (43rd-56th percentiles); Low Anxiety
(LA): scores of 6 (20th percentile) and
below. The Ss in each group were as follows:
HA, 28 (14 male and 14 female); MA, 32
(18 male and 14 female); LA, 42 (28 male
and 14 female).²

Instruments and procedure. Self-regarding
attitudes were measured by the
Bills-Vance-McLean Index of Adjustment and Values (3),
given in group administration. On this
instrument, Ss are asked to make a series of three
ratings along five-point scales, for each of 49
trait-descriptive adjectives. These ratings are
(a) the way S sees himself with respect to a
given trait (self-concept), (b) how much he
likes being this way (self-acceptance), (c)
how much, ideally, he would hope to be this
way (ideal self). The discrepancy between
the self-concept and ideal self ratings summed
without respect to sign yields a fourth
measure: the maladjustment index. Presumably,
the greater the discrepancy between these
two ratings the less stable is the self-concept
and the more maladjusted the individual. A
method for scoring this test is described
elsewhere (8).
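
The discrepancy measure described above can be sketched in a few lines of Python. This is a minimal illustration, assuming each S supplies one self-concept rating and one ideal-self rating per adjective on the five-point scale; the function name `maladjustment_index` and the range check are hypothetical and do not reproduce the published scoring method (8).

```python
def maladjustment_index(self_concept, ideal_self):
    """Sum of absolute self-concept / ideal-self discrepancies.

    Both arguments are equal-length lists of ratings on a 1-5 scale,
    one entry per trait-descriptive adjective (49 in the instrument).
    """
    if len(self_concept) != len(ideal_self):
        raise ValueError("both rating lists must cover the same adjectives")
    for rating in list(self_concept) + list(ideal_self):
        if not 1 <= rating <= 5:
            raise ValueError("ratings are expected on a five-point scale")
    # Discrepancies are summed without respect to sign.
    return sum(abs(s - i) for s, i in zip(self_concept, ideal_self))


if __name__ == "__main__":
    # Three invented adjectives: discrepancies of 1, 0, and 4.
    print(maladjustment_index([1, 3, 5], [2, 3, 1]))
```

A larger total indicates a wider gap between how the S sees himself and how he would ideally like to be.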

One to three weeks later, Ss were seen
individually for the testing of rigidity and
perception on the basis of random assignment to
one of three Es. Problem-solving rigidity was
tapped by the Luchins water jar technique
(18), a task which involves the establishment
of a single mode of solution to a series
of arithmetical problems, following which a
problem is given for which a more direct and
appropriate solution is available. In the present
instance, five set problems were used,
then an extinction problem soluble only by a
new and direct procedure. The rigidity score,
as used by Christie (6), was the time of
response to the extinction problem, allowing a
maximum of ten minutes.

Perceptual reactivity to threat expectancy
was studied by a technique described earlier
by Cowen and Beier (9). A list of 71 words,
23 presumably threatening, was read initially
with each S being advised that he would
be asked to decipher some of these words.
Ss were later shown a series of 16 booklets,
each containing [...] copies of a single
word [...] in capital letters, and arranged
in order from the most blurred copy to the
clearest. After two practice words, Ss were
given the 16 test booklets (eight threat and
eight neutral), in random order, allowing a
maximum of three seconds [...]. A
discrepancy score was computed based on the
total number of trials required for correct
report of threat words minus the total number
of trials for correct report of neutral words.

The present experiment differs from the
Bitterman and Kniffin investigation in several
procedural respects: (a) the use of the
“expectancy” situation (i.e., alerting Ss to the
nature of the threatening stimuli beforehand);
(b) the successive carbon technique in lieu of
tachistoscopic presentation, and (c) the
increased number of stimulus [...].

2 These figures represent the number of Ss who
completed the anxiety and self-concept scales. Six of
these (3 LA, 2 MA, and 1 HA) failed to complete
the rigidity and perception experiments.


Table 1
Mean Scores of A Scale and L Scale Groups for Perception and Self-Concept

                                   Anxiety(a) groups              Lie(a) groups
                                                             Total group     Low A only
Test Variable                    High     Mid.     Low      High     Low     High     Low
1. Perception D score (trials)   25.6     16.5     19.5     21.2     18.5    22.1     19.4
2. Bills Col. I                 169.9    190.2    202.0    197.5    184.4   208.0    193.0
3. Bills Col. II                160.0    175.9    194.5    189.1    175.3   201.9    183.1
4. Bills Col. III               221.1    [...]    [...]    222.3    219.9   226.1    218.5
5. Bills Col. I-III(b)           55.1     36.3     25.8     29.7     39.9    23.3     30.1
                             (1.7119) (1.5403) (1.3721)

a Ns for the several measures are as given in footnotes 2 and 3.
b Transformed scores used to compute the F ratios reported in Table 4 are given parenthetically.


Table 1 summarizes the means for A-scale
and L-scale groups for the perception and
self-concept measures.

Rigidity data. The basic score in the
analysis of rigidity data was the time [...]
extinction problem. These [...] from one
second to the arbitrarily established maximum
of ten minutes, and were highly positively
skewed. Accordingly, chi square was used as
the basis for statistical analysis of these data.
Since there was a natural break in the
distribution of time scores at approximately the
median for the [...] sample, this was used as
the cutting point for dividing the group. Thus
[...] extinction [...] or to arrive [...]
solution. Chi squares [...] the rigidity
scores for (a) [...] groups, (b) high and low
lie groups for the total sample, and (c) high
and low lie groups [...] low A groups,³ were
performed. [...] indicate that there are no
significant relationships between A-scale
status and rigidity. Chi squares [...] computed
using finer cutting points ([...] first to fifth
quintiles), for [...] and L-scale groups; but in
no case did any of these analyses yield
differences which approached statistical
significance.

Perception. [...] recognition threshold
was not significant [...], suggesting that
under the present experimental conditions
differential word frequency [...].

3 Lie scale analyses for the rigidity and subsequent
data are based on High Lie (scores
of six and above) and Low Lie (scores of three and
below) groups. For the total sample the Ns for HL
and LL are [...] and [...] respectively, while the
comparable figures within the Low Anxiety group
are [...] and 14 respectively.

Table 2

Analysis of Variance for Perception D Scores

Source        df      m.s.       F

Anx.           2     592.3
Exp.           2    3487.0
Sex            1    1373.7
A × E          4     970.6
A × S          2    1233.5
E × S          2     548.2
A × E × S      4    1768.3     3.06*
Within        78     577.7
Total         95

* Significant at .05 level.


quency was not a factor in determining re- 
sponse threshold. This finding duplicates an 
earlier one by Cowen and Beier (9), who 
used similar experimental conditions. 

The mean discrepancy score (total threat trials minus total
neutral trials) for the group as a whole was found to differ
significantly from zero in the direction of more trials required
for threat words. The obtained t ratio of 6.24 was significant at
beyond the .001 level. This datum, too, is in accord with earlier
findings by Cowen and Beier (9), as well as those of Bitterman and
Kniffin (4).
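
The t ratio above tests whether the mean discrepancy score differs from zero. A minimal sketch of that computation in Python follows. The individual discrepancy scores are not reported in the article, so the data below are hypothetical, and the function name `one_sample_t` is an assumption made for illustration.

```python
import math
import statistics


def one_sample_t(scores, mu=0.0):
    """Return (t, df) for testing whether the mean of `scores` differs from `mu`."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD, n - 1 in the denominator
    t = (mean - mu) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical discrepancy scores (threat trials minus neutral trials);
# these values are illustrative only and do not come from the study.
hypothetical_d = [12, 30, -4, 25, 18, 41, 7, 22]
t, df = one_sample_t(hypothetical_d)
print(f"t = {t:.2f} on {df} df")
```

A positive t here corresponds to more trials being required for threat words than for neutral words.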

A three dimensional analysis of variance 4 for perceptual
discrepancy scores was computed, and the results of this analysis
are summarized in Table 2. The anxiety × sex × examiner interaction
proved to be significant when tested over the within error term. In
order to use such a significant triple interaction as an error term,
at least one of the variables must be randomly distributed. Both the
anxiety and sex variables fail to meet this criterion, and since we
were unwilling to assume, post hoc, a random selection of Es, there
is no meaningful way in which tests of the higher order effects in
the table as a whole


4 Because there were unequal cell entries, the analy- 
sis of variance used is an approximation (15) re- 
quiring that the following two assumptions be met: 
(a) homogeneity of variance and (b) random dis- 
tribution of cell entries. Homogeneity of variance 
was tested by means of Bartlett’s test and random 
distribution of cell entries was tested by a chi-square 
technique. It was possible here to demonstrate that 
both of the prerequisite assumptions had been ful- 
filled. 
may be interpreted * (16). Accordingly, separate analyses of
variance were computed for each sex, following the rationale
detailed in an earlier paper (14). These data are summarized in
Table 3. Neither Anxiety nor the Anxiety-Experimenter interaction
reached statistical significance for either sex. The Experimenter
main effect, however, is statistically significant for men.
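
The F ratios in Tables 2 and 3 are each effect mean square divided by the within mean square of the same analysis. A minimal sketch in Python that recomputes them from the printed mean squares follows. The function name `f_ratios` and the dictionary keys are illustrative assumptions, and the output is a check on the tabled values rather than the authors' own computation.

```python
def f_ratios(mean_squares, within_ms):
    """Return F = effect mean square / within mean square for each source."""
    return {source: round(ms / within_ms, 2) for source, ms in mean_squares.items()}


# Mean squares as printed in Table 2 (total group) and Table 3 (by sex).
table2 = {"Anx": 592.3, "Exp": 3487.0, "Sex": 1373.7, "AxE": 970.6,
          "AxS": 1233.5, "ExS": 548.2, "AxExS": 1768.3}
men = {"Anx": 260.75, "Exp": 2456.15, "AxE": 829.65}
women = {"Anx": 434.45, "Exp": 615.65, "AxE": 925.90}

total_f = f_ratios(table2, within_ms=577.7)
men_f = f_ratios(men, within_ms=604.72)
women_f = f_ratios(women, within_ms=542.78)

print("Triple interaction, total group:", total_f["AxExS"])
print("Men:", men_f)
print("Women:", women_f)
```

With these inputs the triple interaction comes out at 3.06, as tabled, and the Experimenter effect for men at about 4.06 on 2 and 48 degrees of freedom.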

Originally, it was intended that a similar analysis of variance be
done for the perceptual D scores based on a regrouping of Ss in
terms of their Lie scale scores. The assumption of random
distribution of cell entries, however, could not be fulfilled in
this case. It was therefore necessary to compute t ratios between
HL and LL groups for the total sample and within the low MAS group.
Significant or near significant differences between groups
Table 3

Analysis of Variance for Perception D Scores

Group and
Source        df       m.s.       F

Men
  Anx.         2     260.75    [illegible]
  Exp.         2    2456.15    4.06*
  A × E        4     829.65    1.37
  Within      48     604.72
  Total       56

Women
  Anx.         2     434.45    [illegible]
  Exp.         2     615.65    1.13
  A × E        4     925.90    1.71
  Within      30     542.78
  Total       38

* Significant at .05 level.


5 Since the analysis of variance used here is an approximation
method, the precision of which is not known (15), t ratios were
computed between HA, MA, and LA groups in all combinations. None of
these t's approached significance at the .05 level. The t statistic
was also computed eliminating L scores of 7 and above, but the
results were not altered by this modification.

6 It may be noted parenthetically that even if the within variance
were used as the error term in testing the significance of higher
order effects, only the experimenter main effect would be
significant. The F ratio for the main effect of anxiety would not
approach significance.
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Table 4 


t Rati i 
Satis Comparing Self-Concept Scores for A Scale and Z Scale Groups 


Anxiety groups compared 


Lie groups compared 


Ly A ae T 

M HA vs. MA HA vs. LA MA vs. LA A Towe ony 

re ——— 3 = 
t 

Col] P t $ t fd t p t 

Col. Iy 47 : ` 

A 74.001 8.36 3.55 5 
Col tb 3 .001 3.55 .001 285 O1 
or mr ee 01 6.72 001 3.67 .001 2.35 05 oor iG 
II o: T ns. 0.95 ns. 040 ns. 090 ns. 217 "05 

38.001 746 001 408 001 262 02 185 10 


for perceptual D scores were not obtained in either of these
analyses.
Self-concept data. As the Bills inventory was group administered,
the examiner main effect reported in the preceding analysis of
variance is no longer present. The primary analysis of these data
consists of t ratios comparing the several anxiety groups on each of
the four Bills measures.7 Table 4 summarizes the results of
comparisons between means for A-scale and L-scale subgroups. With
respect to the A-scale data, on three of the four self-concept
indices there are highly significant differences among anxiety
groups, with the HA group having less well adjusted self-concept
scores than either the MA or LA group, and the MA group, in turn,
having less adequate self-concept scores than the LA group. On the
fourth Bills measure (ideal self), there are no significant or near
significant differences among anxiety groups.

Regrouping the data in terms of High and Low Lie-scale scores, both
for the total sample and within the Low Anxiety group, significant
differences between groups are again found on the Bills measures.
For these comparisons, however, the High “Lie” group scored in the
direction of more adequate self-concept.


aye *idiny i Discussion 

a 3 
ie a a quite clear that t 
y measures used in the 


fo, Ath 
r p thoy 
Dry these Bh a l 
ten, hens” data, oe ses of variance are entirely feasible 
e Veitan wee presented in this less com- 
in interests of space, an 


Mite for the 
tant 
ain effect of concern 


“Arcus, 


he ansi- 
present 


ed. The extent to which 
trast to the data reported 
earlier by Maltzman et al. (19) is less clear. The latter
investigators reported significantly more rigid solutions to
critical problems for High A Ss, as well as a tendency
(nonsignificant) for these Ss to have longer solution times on an
extinction problem. Only the extinction problem criterion was used
in the present research, the presumption being that by selecting
extreme A-scale scorers, instead of simply above and below the
median scorers as did Maltzman et al. (19), the tendency for their
above the median scorers to require more time to solve the
extinction problem would be accentuated to the point of statistical
significance. This, of course, did not happen in our study.
It is entirely possible that the extinction 
i Jiable and/or discrimi- 


time criterion is less re 
iterion of num- 


study are not relat 
this finding is in con 


ated success 


g he study by 
(19). While there are no data deriving from 
nt study which could satisfactorily 

is issue, it may be justifiable to point 
e fallibilities of the ex- 

used here, it has at 
demonstrated to be sensitive and 
n jonificantly modifiable 
jmentally indiiced stress (6, 
nt review of this area, Taylor and 
28), noting 2 comparability jn re- 

oy = e study by Maltzman et al. 


by Cowen (7), raise the follow- 


= AG experimentally 
anifest anxiety 
as measured by the Taylor A scale?” (28, p. 463). There is, by now,
substantial evidence in the current literature to suggest that
problem-solving rigidity, operationalized either in terms of number
of indirect solutions to critical problems, or delayed solution of
an extinction problem, increases as a function of experimentally
induced anxiety (6, 7, 24, 27). The data of the present study
suggest that insofar as the second of these two types of rigidity
criteria is concerned, there may be substantial differences between
experimentally induced anxiety and A-scale anxiety.
Perception. Both with respect to the presence of significantly
higher thresholds for threat words in the group as a whole, and in
terms of the absence of differences in perceptual reactivity to
threat and neutral words among anxiety groups, the present findings
directly support those of Bitterman and Kniffin (4). These
investigators conclude “that anxiety plays no important role” in
producing elevated thresholds to taboo words (4, p. 249). We
question this conclusion on several grounds:


1. Bitterman and Kniffin take as established the
fact that the Taylor A scale measures
anxiety. They write that “the fault might be said to lie with the
criterion of anxiety which was used, but there is now considerable
evidence to suggest its validity” (4, p. 249). The evidence referred
to stems primarily from a series of learning experiments done at the
Iowa laboratory (e.g., 13, 17, 23, 25, 30). To the extent that one
is willing to accept such a criterion of validity, recent results
from the Rochester laboratory (2, 14) and elsewhere suggest that
this inferred type of A-scale validity is questionable.

2. It is, of course, possible to define A-scale score as an
operational measure of anxiety, but this in turn raises the question
of the relation of this kind of anxiety to other kinds of anxiety. A
recent study by Moffitt and Stagner (22) reports different findings
for experimentally induced and A-scale anxiety in terms of
relationships to five perceptual tasks, leading these authors to
question the assumption “that manifest and threat induced anxiety
were functionally identical” (22, p. 356).

In a somewhat different area, strikingly discrepant findings are
reported by Eichler (10), using experimentally induced anxiety, and
Westrope (31), using A-scale anxiety, both in relationship to
Rorschach response. Specifically, within the area of “perceptual
defense” data, there is considerable additional evidence suggesting
that experimentally induced anxiety leads to elevation of threshold
for reactions to noxious perceptual stimuli (11, 12, 21).

Because of the recurrent demonstration of different results in
experiments using A-scale anxiety as opposed to experimentally
induced anxiety, it is difficult to see how the Bitterman and
Kniffin data, or our own data for that matter, bear on the
relationship of any generalized concept of anxiety to perceptual
reactivity. We would be willing only to conclude that A-scale
response does not relate to perceptual reactivity to threat.


Self-concept. Insofar as the present series of studies is
concerned, the clearest relationships between A-scale response and
other performance or personality measures are found with respect to
the self-concept indices. Here the relationships are marked;
apparently, the responses tapped by these two types of measures are
quite similar. It should be noted, however, that this relationship
does not involve an overt behavioral measure at either end. It is
conceivable, therefore, that it reflects nothing more than a common
test-taking technique running through two somewhat transparent and
structurally similar self-report inventories. More specifically,
what perhaps is being reflected in these results is a common desire
by some Ss to produce a “culturally good” record on both
inventories. One clue to the possible accuracy of this line of
reasoning derives from the Lie scale analysis of the self-concept
data. Here it was found that HL Ss produced significantly better
self-concept scores than did LL Ss, in exactly the same way that LA
Ss had better self-concept scores than did HA Ss. When anxiety was
held constant by the comparison of HL and LL Ss within the LA group,
the HL Ss continued to have more “adjusted” self-concept scores.


Summary

The purpose of the present investigation was to examine the
relationship between performance on the Taylor A scale and measures
of problem-solving rigidity, perceptual reactivity to
threat-expectancy, and self-concept.

One hundred and two male and female Ss were selected as High,
Middle, and Low A-scale scorers. All Ss completed the Bills
Inventory (self concept) in group administration. Subsequently, 96
Ss were given the Luchins water jar task of problem-solving
rigidity, and a test of perceptual reactivity to threat expectancy,
in individual administration.

No relationships between A-scale response and performance on either
the rigidity or perception tasks were observed. Significant
differences among anxiety groups were observed on self-concept
indices, such that Low A scorers characteristically had more
adequate self-regarding attitudes. Interpretation of the latter
finding is obscured, in that when the data were regrouped, Ss with
High “Lie” scores were found to have significantly more adequate
self-regarding attitudes than did Low “Lie” scorers.
Barron's Ego-Strength Scale: A Replication of an
Evaluation of Its Construct Validity 1


Arthur S. Tamkin and C. James Klett 


Veterans Administration Hospital, Northampton, Massachusetts 


A previous paper (2) has evaluated the construct validity of
Barron’s Ego-Strength Scale (Es) (1) by determining its correlations
with other reputed measures of ego strength and general
psychopathology, and by testing its ability to differentiate degree
of pathology as determined by psychiatric diagnosis. The measures of
psychopathology were the F and the Critical Item (CI) scales of the
MMPI. While the findings of significant negative correlations
between Es and F, and Es and CI were encouraging for the question of
validity, the inability of Es to discriminate degree of
psychopathology based upon psychiatric diagnosis and the failure to
demonstrate a significant relationship to Wechsler IQ scores seemed
to constitute a serious challenge to its validity when applied to
hospitalized psychiatric patients.

1 An extended report of this study may be obtained without charge
from Arthur S. Tamkin, Veterans Administration Hospital,
Northampton, Mass., or for a fee from the American Documentation
Institute. Order Document No. 5316, remitting $1.25 for microfilm or
$1.25 for photocopies.

Since the number of subjects previously used was small, consisting
of 15 psychotics and 15 psychoneurotics and personality disorders,
it was decided to replicate the previous study with larger groups of
patients from the same institution. One hundred MMPI protocols of
the group form were randomly selected from the files of newly
admitted or readmitted psychiatric patients. They were scored for
Es, CI, and F, and data on Wechsler full scale IQ, age, education,
and established psychiatric diagnosis were obtained for a major part
of the sample. In most of the essential respects, previous findings
were corroborated at a somewhat higher level of statistical
significance. The coefficient of correlation between Es and CI was
−.66; between Es and F, −.56; and between CI and F, .80. Es again
failed to separate the two diagnostic groups at the .05 point.
Additionally, it was found that Es was significantly correlated with
IQ (.32) and education (.45) and uncorrelated with age (−.20). The
significant coefficient between Es and IQ is contrary to the
previously reported finding of no significant correlation based upon
a sample of small N, and it brings the findings into line with those
of other investigators who have confirmed the relationship between
Es and intelligence. While the additional corroboration of this fact
is suggestive of construct validity for Barron’s scale as a measure
of ego strength, the confirmation of the scale’s inability to
separate diagnostic groups presumably of differential levels of ego
strength still suggests caution in the application of the Es scale
to hospitalized psychiatric patients.
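
Whether a correlation of a given size is significant depends on the number of cases behind it, which this brief report does not state for each coefficient. A minimal sketch of the usual t test for a Pearson r follows. The sample size `n_assumed` is hypothetical and the function name `t_from_r` is an assumption made for illustration, so the printed t values should not be read as the authors' results.

```python
import math


def t_from_r(r, n):
    """Return (t, df) for a Pearson correlation r computed on n cases."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    return t, df


# Correlations reported in the text. n_assumed is hypothetical: the report
# says only that data were obtained for "a major part" of 100 protocols.
n_assumed = 80
reported = [("Es-CI", -0.66), ("Es-F", -0.56), ("CI-F", 0.80),
            ("Es-IQ", 0.32), ("Es-education", 0.45), ("Es-age", -0.20)]
for label, r in reported:
    t, df = t_from_r(r, n_assumed)
    print(f"{label}: r = {r:+.2f}, t = {t:+.2f}, df = {df}")
```

Rerunning the loop with a different `n_assumed` shows how the same coefficient moves toward or away from significance as the number of cases changes.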

Brief Report. 

Received May 6, 1957. 
Configurational Analysis of MMPI Profiles
of Psychiatric Groups 1

1 From the Veterans Administration Hospital, and Mental Hygiene
Clinic, Omaha, Nebraska.

Although the interest in “sign” versus “configurational” approach in
the interpretation of psychological test data has been concerned
with the projective techniques, attention has also been given to
response “patterns” with some of the more structured tests.
Specifically, several studies have dealt with pattern analysis of
MMPI data. A sampling of the results clearly reflects disagreement
concerning its value as well as effectiveness in differential
diagnosis (1, 2, 3, 4, 5).

In 1952, Sullivan and Welsh (10) reported a technique for an
objective configurational analysis of the MMPI. The purpose of the
present study is to investigate the usefulness of this type of
analysis of the MMPI profiles for differentiating, diagnostically
and behaviorally, psychoneurotic and schizophrenic groups.

Method

An objective configurational analysis, essentially that developed by
Sullivan and Welsh, was applied to the MMPI records of 40
psychoneurotics, hereinafter referred to as the psychoneurotic
criterion group (PnCGp), and 40 hospitalized schizophrenics
(SchCGp). The neurotic group consisted of half inpatients and half
mental hygiene clinic outpatients. Each of the clinical scales of
the MMPI was assigned a code number according to its position on the
psychogram, i.e., from 1 to 9. The MMPI profile for each patient was
coded, with the scale code numbers arranged in order of descending T
score (rank-order). Pair comparisons of all the scales for each
record were made by assigning the higher of two scales a plus and
the lower one a minus (e.g., if the Hs is higher than D by one or
more T scores, the 1 − 2 comparison will be +, while the 2 − 1
comparison will be −). Application of the analysis is illustrated in
Table 1. Scales with the same T score were tabulated as equal, and
as the number of them was small they were not included in the
significance tests. Chi-square tests of differences between the
neurotic and schizophrenic groups were then applied to the sums of
the pluses and minuses for each of the scale pairs. For example,
Table 1 shows that 25 of the 40 neurotics obtained a + score on the
Hs (1) > D (2) comparison and 15 obtained a − score, as compared to
15 and 23, respectively, for the schizophrenics (the Hs and D scores
were equal in two records), yielding an χ² of 4.1. The groups were
differentiated at the .05 level or less by 18 (out of a possible 36)
scale pairs.

The mean ages for the PnCGp and SchCGp were 32.6 years (SD 8.2) and
30.7 years (SD 5.5) respectively; mean education levels were 11.7
years (SD 2.9) and 11.9 years (SD 2.7). The differences are not
significant.

Cross validation of these scale pairs was then carried out by
repeating the above procedure using a new group of 40 neurotics
(PnVGp), again with an equal number of in- and outpatients, and 40
hospitalized schizophrenics (SchVGp). In the cross validation, only
the 18 scale-pair comparisons which differentiated the original
groups were used instead of the possible 36 comparisons. Age and
education were com-
Table 1

Illustration of the Rank-Order Tabulation of the MMPI Profiles (Subject Pn-1, Profile Code, 213478569;
Subject Sch-1, Profile Code, 98 (56)74312 a)

                          Subjects                        Totals
                   PnCGp             SchCGp          PnCGp     SchCGp    PnCGp vs. SchCGp
Scale pairs      Pn-1  Pn-2...40   Sch-1  Sch-2...40   +    −    +    −      χ²      p

1-2 (Hs > D)      −                  +                25   15   15   23     4.1    .05
1-3 (Hs > Hy)     +                  −                27   12   16   22     5.7    .02
1-4               +                  −                28   12    9   30    17.5    .001
1-5
1-6
1-7
1-8
1-9
2-3
8-9

a The parentheses indicate that the scales enclosed have the same T
score and that the particular scale pair(s) formed by these scale
numbers are not tabulated. This accounts for the sum of the pluses
and minuses not always being 40 in each group for each scale pair.
The total scale pair score for the above two profile codes would be
16 for the neurotic and 0 for the schizophrenic.

parable to the criterion groups. Sixteen of the 18 scale pairs again
significantly differentiated the groups. (Prior to obtaining test
data for the second schizophrenic group, the PnVGp was compared with
the SchCGp and the same 16 scale pairs were significant.) These 16
scale pairs which significantly differentiated the two criterion
groups from the two validation groups were then applied to a third
group of 50 outpatient neurotics (PnVGp2).

All subjects in this study were males and were VA patients with the
exception of 30 validation schizophrenics who were patients in a
neighboring state hospital.2 Comparison of the mean profiles for the
two schizophrenic groups reveals that they are not significantly
different. The schizophrenics were relatively acute cases and more
than half of them were classified as the paranoid type. The 90
outpatients were diagnosed independently of their MMPI results; the
extent to which the MMPI contributed to the diagnosis of the
remaining patients is not known, but it is believed that in the
majority of cases the diagnoses were minimally influenced by the
MMPI findings.

2 Grateful acknowledgment is made to Dr. J. L. Yager, Chief
Psychologist at the Omaha VA Hospital, for furnishing the MMPI
profiles of the VA hospitalized patients, and to Dr. W. G. Klopfer,
Chief Psychologist at the Norfolk (Nebr.) State Hospital, for
permission to use the records collected by the senior author while
he was on the staff at that hospital.
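
The chi-square values quoted for the scale-pair comparisons can be checked from the plus and minus totals. A minimal sketch in Python follows, using the counts for the first three scale pairs of Table 1 (with the 1-2 counts as stated in the text). The use of the uncorrected Pearson formula is an assumption, adopted here because it reproduces the printed values; the function name `chi_square_2x2` is illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# (neurotic +, neurotic -, schizophrenic +, schizophrenic -)
table1_rows = {
    "1-2": (25, 15, 15, 23),
    "1-3": (27, 12, 16, 22),
    "1-4": (28, 12, 9, 30),
}
chi_values = {pair: round(chi_square_2x2(*counts), 1)
              for pair, counts in table1_rows.items()}
print(chi_values)  # {'1-2': 4.1, '1-3': 5.7, '1-4': 17.5}
```

Tied records are simply left out of the counts, which is why the rows do not always sum to 40 per group.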
Results jf l 


PEF dif- 
The 16 scale pairs which significant oo 
ferentiated the neurotic group from the 


Table 2 Fi 
Scale Pairs Differentiating the Neurotic an 
Schizophrenic Groups 
vS 

PnCGp vs. Pav Gp 

SchCGp _ ae 
LHs()>Hy@) 37 02 s oot 
2. Pd (4) 17.5 .001 30 on 
3. Mf (5) 8.5 01 224 00 
A Pa (6) 10.2 01 23 OM 
3: Pe 42 05 gg M 
$: Sc (8) 138 001 31 % 
T: Ma (9) 49 .05 13i ow 
8. D (2)> Pa (4) $4. 105 148 oo 
9. Pa (6) 44 .05 133 %2 
10. Hy (3)> Pa (4) 10.6 01 Eg O 
11. Mf (5) 9.5 01 16 ae 
Pa (6) 8.8 01 64 ‘98 
13. Ma (9) 6.3 .02 34 “oo | 
14. Pt (7) > Mf (5) 99 OL 156 ‘go! 
15. Pa (6) 10.6 01 AS 
16 Sc (8) EOR 


> 
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Table 3 


Frequency and Percentage of S 


cale Pairs Occurring in the Criterion 
Differences in the Total Groups 
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and Validation Groups and 


No. of "o 
Ea Criterion Validation Valid: Total X 
pai = = sn 
os Pn Sch Pan Sh Pn Po Sch Total Ż 
13-16 
23 5 22 3 38 83 8 58.48 
Pin (57.5%) (12.5%) (55%) (7.5%) (76%) (63.8%) (107%) i SE 
„12 16 16 13 8 36 29 
Ta (30%) (40%) (40%) (32.5%) (16%) (27.7%) (36.2%) 
5 19 2 24 4 11 43 53.18 <.00 
(12.5%) (47.5%) (6%) (60%) (8%) (8.5%) (53.8%) p 
Phrenj 
nic group (both criterion and validation judges were instructed to classify each profile 
chizophrenic. The only 


Brou 
Ps), and the levels of significance, are 
1 18 scale 


ceptable 
lidation 


Pre. H 
hie are in Table 2. Of the origina 
significan, two which failed to meet ac 
group a when applied to the va 
able y (a) 1> 2 and (b) 2> 5 
Sire presents an interva 
differ eney of scale pairs occu 
Pair 
ae lower (0-6) ranges repre: 
e neu ective cutoff points i 
rotics and schizophrenics, Tes! 


] tabulation of 
rring in the 
16 scale 


either ne 


1946, as I 
tive count: 


as either neurotic or si 
identifying information 
was the individ 
tion. They were told that 
urotic or schizophrenic, 
told how many of e 
dition, the six signs 


the diagnosis of subc 
lied to this same samp 


available to the judges 
ual age, education, and occupa- 
every record was 
but were not 
ach were included. In ad- 
reported by Meehl (6) in 
ater modified for purpose of objec- 
ing by Peterson (8) in his study on 
Jinical schizophrenia, were 
le of 100 MMPI 
by the judges. These signs 
. to be commonly employed in 


f 
ân eee of the above procedures a 
Pr ment of the data, 100 of the MMPI are said - 
es (40 out- and sige Mi neurotics, differentiating MMPI psychotic profiles from 
lect 40 schizophrenics) were randomly sé neurotic ones (8, P- 198).” The number and 
€d from the total population and submitted ercentage of this sample correctly and incor- 
pop rectly identified by the scale pairs, Meehl’s 


r : 
Wo ri experienced clinical psy 
urth-year psychology trainees. 


8 
Th 

© authors are deeply indebted to 

d 


chologists a® 
i 8 These 


the following 


the st 


udy: Drs. Go: 
fer, Irving Simos, 
Robert Scott. 


Bennett, W. G. Klop- 


rdon Filmer- 
ph Anderson and 


and Messrs. Ral 


Psych, u 
Ologists and trainees who serve as judges for 
Table 4 
sficati {PI Profiles by Scale Pairs, 
t Identifications of 100 MM 
Number of Correct and Tncoaen a ons udges 
Said e jt. Fa Bee JYT: 
pairs sign! ae z 
Group y 7 wer F T rae +- +- E 
aw Z 44 16 30 30 
Schizo e 21 41 19 33: 27 
i, Ge at Fr pis Hela! a en 25 15 
Tonia” Gh oa ea e ie Ae aC Aan 
ai 9 14 
Paranoid Ay ak ee 
19) 3 14 
otal (9) 16 stapes yn «3 60. 40 «7 228 55 45 
100 78 22 
than Meehl's signs and all of the judges (at the .05 level or less, 
tter 


Using No 
g Rote. z 
Fisher's eos pate identified the paranoid subgroup be 


———————“‘ 


— ee 
Table 5

Results of Chi-square Test Between the Judges, Scale Pairs and Meehl’s Signs

               Meehl’s
               signs     J-VTr     J-IVTr    J-III     J-II      J-I

Scale pairs    5.41*     11.87**   1.29      7.57**    3.03      2.54
J-I            0.55      3.57      0.21      1.39      0.02
J-II           0.35      3.03      0.37      1.06
J-III          0.19      0.51      2.68
J-IVTr         1.44      5.49*
J-VTr          1.32

* Significant at less than .05 level.
** Significant at less than .01 level.

signs, and each of the five judges are shown in 
Table 4. 

The results of the chi-square tests of significance among the
judges, scale pairs and Meehl’s signs are reported in Table 5. Only
two of the judges differed significantly in the efficiency of their
judgments, although the difference between others approached
significance. The scale pairs correctly classified significantly
more than two of the judges and Meehl’s signs.

Discussion 


The findings of this study indicate that the application of a
configurational analysis technique for analyzing MMPI data yields an
effective means for differentiating groups of neurotic and
schizophrenic patients. We believe that the type of analysis
presented helps in understanding and evaluating the behavior
processes involved in these patient groups, as well as being a very
valuable aid in making a differential diagnosis.

Many of the scales which do not differentiate the groups when their
mean absolute elevations are considered become very significant when
interscale relationships are investigated. For example, the Mf
scores are almost identical for the two groups, yet the Mf scale is
significant in three of the 16 scale pairs; similarly for the Pd
scale. The writers believe that this can be explained by
interpreting the data within a psychodynamic frame of reference.
The Sc and Pa scales reflect the greater dis- 
turbance in thought processes, more of a ten- 
dency to distort and, in general, the more pre- 
carious reality contact of the schizophrenic 
patients. These patients are attempting to al- 


leviate their anxiety by such defenses aS pri 
jection (Pa) and hyperactivity (Ma), as oon 
trasted to the neurotics’ greater use o g: a 
somatic complaints, repression and obsess 
compulsive behavior (Hs, Hy, and P17", 
schizophrenics’ relatively higher elevation e 
the Pd, Pa, Ma, Sc, and Mf scales (COMP? ye 
to Hs, D, Hy and Pt) is consistent W? 
generally accepted view that they ten Jsives 
more hostile, asocial, suspicious, imp? m 4 
and less bothered by feelings of anxiety, 
doubt, and less able to show guilt oF pas 
than the neurotics. Also, the schizophr a] 
have stronger feelings of family 4” re col” 
alienation and rejection. The neurotic $ 
flict usually takes place within himself eak! 
the schizophrenic, because of his eg° “iious 
ties with the external world, is more a 
against conventional practices an 
many of his emotions, especially 
more openly than does the neurotic: 
nificance of the Mf and Pd scales 1 | ly 1e55 
pairs may be the schizophrenics’ relative or 
adequate identification with the cultura r cor” 
of masculinity and greater disregat 
ventional behavior in general. 
The mean profiles of the in- an4, 
neurotics are similar and differ a he 0 
only on the Hs and Hy scales, wit Fe 
patient group having greater eleva 
both. The mean schizophrenic P!O st 
a mild negative slope, with the highet 
on the Sc scale (T = 69) and a rise ° 
scale, as contrasted to the diphasi® 
with a positive slope of the neuro 
The most striking difference betwee? 
sults of the two groups is the Dre 
and relative absence of Hs-Hy defe” 


ostili 
sig” 
The ale 


| 


wr 


Aai g 
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— by the schizophrenics. The large num- 
ne Bah ech cases probably accounts for 
tion ay haa (Pa, Sc and Ma) eleva- 
in ates heer is but which may be detected 

Tt is aps higher Pa-Sc combination. 
the bas en very difficult to differentiate, on 
dthtendte of the MMPI, the acute paranoid 
This is ‘se from the hospitalized neurotic. 
keep hi ja because a defensive patient may 
and dias bi a score well within normal limits 
of perse ecause undue sensitivity and feelings 
rotics oe may be present 1n many neu- 
in mali e scale pairs may be a valuable aid 
Aaron mg this differentiation, particularly in 
Ee z of schizophrenics such as this where 
range a o score was outside the “normal : 
cantly l of the judges misclassified signifi- 
mete More of the paranoid schizophrenics as 
pairs ics than did the scale pairs. The scale 
the eN consistently more successful than 
nic K in making the neurotic-schizophre- 
studie erentiation, which adds to the list of 
aN reported by Meehl (7) showing the 
ie a of the actuarial over clinical pre- 
Dredicti There was a tendency for successful 
limite wn to be related to experience, but the 
est 4 number of judges and the fact that the 
and poorest performances were by those 


least experienced, prohibit the drawing of any conclusions regarding the
relationship of experience …

… in those cases where there is a problem of differentiating between
neurotic and schizophrenic conditions, the cautious application of the scale
pairs may be quite useful. If the scale pair score is ≥ 13 or ≤ 6, the
patient is likely to be neurotic or schizophrenic, respectively. Any score
falling within the 7-12 range should be considered "indeterminate."
(Subsequent research by the senior author reveals that latent
schizophrenics, neurological cases, and "normals" often score between 7 and
12. This would contraindicate using any scores falling in the indeterminate
range to classify an individual as being "more like" a neurotic or
schizophrenic.)
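
The cutoff rule stated above can be written out as a small function. This is only an illustrative sketch and not part of the original analysis: the function name is hypothetical, and because the lower cutoff is damaged in this copy, the value 6 is an assumption inferred from the stated indeterminate range of 7-12.

```python
def classify_scale_pair_score(score: int) -> str:
    """Return a provisional label for a scale pair score.

    Cutoffs follow the text: 13 or more suggests a neurotic profile,
    7-12 is indeterminate. The lower cutoff (6 or less suggests a
    schizophrenic profile) is an assumption inferred from that range.
    """
    if score >= 13:
        return "neurotic"
    if score <= 6:
        return "schizophrenic"
    return "indeterminate"  # scores 7 through 12


# Hypothetical scores, for illustration only.
for s in (15, 9, 3):
    print(s, classify_scale_pair_score(s))
```

A score in the indeterminate range is deliberately left unclassified, in keeping with the caution in the parenthetical note.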


Summary 


A technique of objective configurational 
analysis was applied to 210 MMPI profiles 
(two groups of schizophrenics and three 
groups of neurotics). Sixteen scale pairs were 
obtained which significantly differentiated two 
criterion and three validation groups of psychi- 
atric patients, and cutoff ranges are presented 
which identify them at a very high level of 
confidence. Application of the analysis is 
shown to exceed the differentiating efficiency 
of three experienced clinical psychologists and 
two advanced psychology trainees. Discussion 
of the findings with respect to differential di- 
agnosis and behavior processes is presented. 


Received January 24, 1957.


Scoring Intelligence on the Lowenfeld Mosaic Test¹


Malcolm H. Robertson 
University of Mississippi 


The Lowenfeld Mosaic test is a projective 
test of personality. As with other projective 
tests, a difference of opinion exists regarding 
the use of this test to evaluate intelligence.
Thus, Diamond and Schmale (1) concluded 
that there is no relation between performance 
on the Mosaic test and IQ. However, McCul- 
loch and Girdner (2) reported a correlation 
of .43 between estimates of mental age and 
Stanford-Binet mental age. 

The hypothesis of this investigation was 
that meaningful categories could be developed 
and that scores on these categories show a 
predictive relationship to intelligence as meas- 
ured by the Stanford-Binet test. 

Ninety elementary school children were di- 
vided into three groups of 30 children each on 
the basis of their IQ levels: above average,
average, and below average. The controlled
variables were sex, father’s occupation, and 
grade level. The children had been referred to 
a school clinic for a variety of reasons. 

Four psychologists judged the Mosaic pro- 
ductions of each child with respect to general 
intellectual level and four independent cate- 
gories. The categories were Movement, Origi- 
nality, Complexity, and Gestalt. The scoring 
was completed on a three-step rating scale. 
Chi-square values and contingency correla- 
tions were computed. 
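
The chi-square and contingency computations mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the author's worksheet: the 3 x 3 table of counts is hypothetical (three IQ groups of 30 children by a three-step rating), the function names are invented for the example, and the contingency coefficient is taken to be Pearson's C = sqrt(chi2 / (chi2 + N)).

```python
import math


def chi_square(table):
    """Pearson chi-square for a table of counts; returns (chi2, N).

    Assumes every row and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def contingency_coefficient(table):
    """Pearson's contingency coefficient C = sqrt(chi2 / (chi2 + N))."""
    chi2, n = chi_square(table)
    return math.sqrt(chi2 / (chi2 + n))


# Hypothetical counts: rows are IQ groups (above average, average, below
# average); columns are the three steps of a judge's rating scale.
ratings = [
    [15, 10, 5],
    [10, 10, 10],
    [5, 10, 15],
]
chi2, n = chi_square(ratings)
print(round(chi2, 2), n, round(contingency_coefficient(ratings), 3))
```

For a 3 x 3 table C cannot reach 1.0, so coefficients of this kind are best compared only across tables of the same size.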

For Complexity and Originality, two of the four correlations were
significant. Only one correlation was significant for Movement, and …
Gestalt. The average correlations for the four judges for Movement,
Complexity, Originality, and Gestalt were .22, .33, .38, and …,
respectively. Correlations between estimates of general intellectual level
and Stanford-Binet IQ levels were significant for two of the judges (.42
and .48).

Interjudge agreement on estimates of general intellectual level and the
scoring categories was generally significant statistically, although the
intercorrelations were not high. Intrajudge consistency in scoring the
categories and estimating intellectual level was high, although the
categories still appear to measure different aspects of performance.

Thus, the Stanford-Binet seems to measure aspects of intelligence which
differ largely from what is tapped by the Mosaic test. The estimate of the
general intellectual level from the Mosaic test performance appears to be
based to some extent on such characteristics as the complexity,
originality, and organization of the Mosaic productions.


1 An extended report of this study may be obtained without charge from
Malcolm H. Robertson, Student Counseling Service, University, Mississippi,
or for a fee from the American Documentation Institute. Order Document
No. 5914, remitting $1.75 for microfilm or $2.50 for photocopies.


Brief Report. 
Received May 10, 1957.


The Social Desirability Stereotype in a
Hospital Population


C. James Klett
Veterans Administration Hospital, Northampton, Massachusetts


In studies (2, 6) of the social desirability scale values of the items
appearing in the Edwards Personal Preference Schedule (EPPS), it has been
demonstrated that judgments of what constitutes socially desirable and
undesirable behavior are remarkably stable from group to group and are
relatively independent of such variables as age, education, sex, and
socioeconomic status. Subcultural differences have been investigated by
Fujita (5) and Lovaas.¹ The present paper extends these observations to
include a sample of male patients hospitalized for neuropsychiatric
disorders.

It would be intrinsically interesting if such patients deviated
systematically from the social desirability stereotype of the
nonhospitalized groups. Furthermore, if such were the case, it could be
expected to be a determinant in their performance on the EPPS since it has
been shown that even slight differences between groups on the matching
variable can affect probability of endorsement in forced-choice tests (6).

The 118 subjects used as judges in this study were, for the most part,
drawn from the admissions ward and consisted of manifestly disturbed
patients, many of whom were actively delusional and hallucinatory. An
additional five patients refused the task. By means of the hospital
records, 89 of the patients were classified as psychotic and 29 as
nonpsychotic.

The subjects were asked to indicate on a nine-point scale their judgment of
the desirability of the behavior denoted by each of the 140 items scaled by
Edwards in his development of the EPPS. Cumulative frequency distributions
of the judgments were made separately for each item and the median interval
values of the psychotic and nonpsychotic groups were compared. Since there
was no essential difference between the two groups in terms of the
intervals in which the median judgments of the items fell, the separate
distributions were combined and the social desirability scale values of the
items were obtained by the method of successive intervals (1, 4). These
scale values correlated .88 with the scale values of the college group (2),
and .87 with those of the high school group (6). Although these
correlations are quite high, both are significantly lower than the
correlation of .93 obtained between the college and high school scale
values.²

In order to determine whether differences among these three distributions
of scale values were random or were systematically related to item content,
the scale values of each of the three distributions were transformed into
relative deviates, and comparisons were made for each of 13 psychological
needs included in the 140 items.³ Each need was represented by ten items.
The results of the matched pair analyses of variance are summarized in
Table 1, which indicates the relative standing of each group in terms of
the social desirability of each need.


1 Personal communication.


Table 1

The Relative Standing of College, High School, and Hospitalized Groups in
Terms of the Social Desirability of 13 Psychological Needs

Psychological needs     Hospital vs.    Hospital vs.    College vs.
                        College         High School     High School

 1. Achievement         College         …               High School
 2. Deference           Hospital*       Hospital*       High School
 3. Order               Hospital*       High School     High School
 4. Exhibitionism       College         Hospital        College
 5. Autonomy            Hospital        High School*    College
 6. Affiliation         College**       Hospital        College*
 7. Intraception        College*        High School     College
 8. Succorance          College         Hospital        High School
 9. Dominance           Hospital        Hospital        College
10. Nurturance          Hospital        High School*    High School
11. Change              College         …               College
12. Endurance           …               Hospital        High School*
13. Aggression          Hospital        Hospital        …

* The mean of this group is significantly higher (more socially desirable)
than the mean of the comparison group at the .05 level of confidence.
** Significance at the .01 level.


Discussion 


Although there was considerable agreement 
in what constituted socially desirable and un- 
desirable behavior between the hospitalized 
group and the nonhospitalized groups, there 
was significantly less agreement or adherence 
to the social desirability stereotype than was 
found between the two nonhospitalized groups. 
The hypothesis that this disagreement was in 
any way related to pathology did not receive 
support from a comparison of psychotic and 
nonpsychotic patients, but it might be argued that such a dichotomy does
not provide sufficient contrast in degree of pathology, besides being
subject to erroneous classification. It is conceivable that the judgments
of a neurotic might be more seriously affected than those of a borderline
schizophrenic who is not so acutely ill.


2 Since scale values were not available for 6 heterosexuality items in the
high school and college group and one additional item was not scalable,
these correlations are based on an N of 133.

3 The EPPS is made up of 15 psychological needs, but abasement and
heterosexuality are not included in this analysis because all of the scale
values were not available. See manual (3) for definitions of the needs.


Perhaps a better test is provided by the comparison of the combined
hospital group with the two nonhospitalized groups in terms of individual
needs. If the disagreement present was due simply to sampling errors, or
errors introduced by generalized factors related to pathology such as
inattentiveness, confusion, bizarre responses, etc., it could be expected
that such errors would be random and not show up as systematic differences
as a function of item content (psychological needs). Such was not the case.
As can be seen from Table 1, the hospital group as compared with the
nonhospitalized groups indicated, at a significant level, that items
relating to Deference, Order, and Aggression were more socially desirable
and that items relating to Affiliation, Intraception, and Change were more
undesirable. It would seem that, in spite of the substantial overall
agreement on the social desirability stereotype, there are some systematic
differences in terms of individual needs.

In the EPPS, items representing different psychological needs were paired
on the basis of their social desirability scale values for the college
group, the objective being to minimize the effect of social desirability on
probability of endorsement. When administered to groups having different
concepts of what is socially desirable, the item pairs are likely to be
less adequately matched and, consequently, the probability of endorsement
is more likely to be affected by social desirability. This proved to be the
case with the high school group even with very high agreement in social
desirability scale values (6). If there is sufficient disagreement in
social desirability of a systematic nature, this effect could result in a
difference in the means of the EPPS subscales for the two groups. If such
were the case, the hospital group could be expected to score higher than
the college group on those needs perceived by its members to be more
desirable and lower on those needs judged to be less desirable.

Some confirmation is provided by EPPS data gathered from a group of 40
paranoid schizophrenic males in this hospital.⁴ They had significantly
higher means on …, Order, and Endurance than the college normative group
and significantly lower means on Exhibitionism, Dominance, and Change. The
degree of conformity between these data and the present findings may be
enhanced by the findings on Affiliation, which was lower in the paranoid
group although failing of significance, and by Endurance, which was judged
to be more socially desirable by the hospital group but just failed to
reach significance.

It cannot be concluded, however, that differences in the means of subscales
of the EPPS from group to group are a function of what the group believes
to be socially desirable rather than reflecting differences in relative
need strength. It is just as obvious that judgments of what is socially
desirable are in part a function of the dominant traits of the judging
group as it is obvious that probability of endorsement of an item is a
function of the social desirability scale value of the item. The two are …
confounded. Edwards' solution of this problem seems to be the best to date,
with the limitations that: (a) it is not possible in practice to match
items perfectly for social desirability; and (b) that in spite of striking
stability of the social desirability stereotype from group to group there
can be systematic differences that may influence the results obtained on
the EPPS.


Summary

The 140 items … veterans hospitalized for neuropsychiatric disorders. The
results were as follows:

1. There was no essential difference between psychotic and nonpsychotic
patients in their judgment of the desirability of items.

2. There was a high degree of relationship between the hospitalized group
and the college and high school groups in the scale values of the items.

3. In spite of the high agreement, there were systematic differences in the
social desirability of subscales representing psychological needs.

4. There was close conformity between those subscales (needs) showing
significant differences in social desirability and the subscales showing
significant differences in means on the EPPS for a hospitalized group.


Received January 25, 1957.


References

1. Edwards, A. L. The scaling of stimuli by the method of successive
intervals. J. appl. Psychol., 1952, 36, 118-122.
2. Edwards, A. L. The relationship between the judged desirability of a
trait and the probability that the trait will be endorsed. J. appl.
Psychol., 1953, 37, 90-93.
3. Edwards, A. L. Edwards Personal Preference Schedule. Manual. New York:
Psychol. Corp., 1954.
4. Edwards, A. L. Techniques of attitude scale construction. New York:
Appleton-Century-Crofts, 1957.
5. Fujita, B. An investigation of the applicability of the Edwards Personal
Preference Schedule to a cultural subgroup. Unpublished master's thesis,
Univer. of Washington, Seattle, 1955.


The Rorschach Performance of Epileptic Children


Merville C. Shaw 


Chico State College 


and William M. Cruickshank 


Syracuse University 


…, these studies report certain major differences … nonepileptic patients,
… stated that epileptics … color naming, C responses … total number of
responses … compiled a list of … of brain damage.

Altable (1) reported the Rorschach performances of thirty epileptics of
widely varying ages and with many different types and degrees of epilepsy.
He stated that 70-75% of interpretations were based on F, that there was a
scarcity of … percepts, and … study by Arluck …

Other studies, including those of Guirdham (4), have reported differences …
have sometimes been in contradiction to one another.

Method 


A number of factors may possibly account for the sometimes contradictory
results obtained in studies of this sort: (a) failure to use control
groups, (b) use of epileptic subjects of varying severity and type, and (c)
inadequate statistical treatment of the data. The current study was
designed to overcome, insofar as possible, these shortcomings.

Twenty-five institutionalized idiopathic epileptic children classed as
grand mal were individually matched with 25 nonepileptic children from
other institutional populations on the basis of age, sex, and intelligence.
Children were not included in either group if motor defects, uncorrectable
sensory defects, or any history which might conceivably indicate brain
damage were present.

The mean chronological age of the subjects in the control group was 14
years 3 months and in the experimental group … years … months. The mean IQ
of subjects in the control group was 81.52; of the experimental subjects,
81.16. The 1937 revision of the Stanford-Binet was used as the measure of
intelligence.

The Rorschach was then administered to all fifty children. The protocols
were scored three times by the senior author. The first scoring came almost
immediately after the administration of the test, the second after an
interval of approximately a month, and the third after the interval of
another month. In the final scoring only three changes of any kind were
made. The t test for matched pairs was used to compare the experimental and
control groups on the various Rorschach scoring categories.
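
The matched-pairs t test named above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the paired scores below are invented, the function name is hypothetical, and the statistic is the standard t for the mean of paired differences with n - 1 degrees of freedom, which may differ in computational detail from the authors' procedure.

```python
import math


def paired_t(first, second):
    """Return (t, degrees of freedom) for matched pairs of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples of at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Unbiased variance of the paired differences.
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    std_err = math.sqrt(var_diff / n)
    if std_err == 0:
        raise ValueError("all paired differences are identical")
    return mean_diff / std_err, n - 1


# Hypothetical scores for four matched pairs (epileptic child, control child).
epileptic = [3, 5, 4, 6]
control = [1, 2, 4, 3]
t, df = paired_t(epileptic, control)
print(round(t, 3), df)
```

With 25 matched pairs, as in the study, the test has 24 degrees of freedom.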


Table 1

Significance of Differences Between Epileptic and Control Groups in
Rorschach Location, Determinant, and Content

                                      Epileptic  Control  Difference
Score                                   Means     Means    Between      t      P
                                                            Means
Location
  W                                      3.24      4.10       .86      1.21    …
  D                                     16.84     13.36      3.48      1.51    …
  Dd                                     2.36      1.28      1.08      1.29    …
Determinant
  F                                     10.12      8.16      1.96      1.41   n.s.
  F−                                     5.04      3.68      1.36      1.86   .10
  …                                      1.32      1.48       .16       .48    …
  …                                       .32       .16       .16       .88    …
  FV                                      .72       .32       .40      2.38   .05
  …                                      1.00      1.32       .32      1.09    …
  …                                      1.08       .52       .56       .85    …
Content
  No. different types of content         8.04      6.00      2.04      2.05    …
  No. different types of
    secondary content                     .92      1.20       .28       .97    …
  P                                      3.32      4.08       .76      1.79    …
  A                                      9.84     10.24       .40       .28    …
  Ad                                     1.48      1.00       .48      1.88    …
  H                                      1.88      1.96       .08       .07   n.s.
  Hd                                     2.04      1.04      1.00      1.73    …
  …                                     27.58     28.56        …         …     …
Results

Analysis of the location of percepts by use of the t test indicated that
there were no significant differences between the two groups on any of the
three variables involved. These results are summarized in Table 1.

Analysis of the differences between the two groups with regard to
determinants was carried out in those cases where the frequencies were high
enough to make it a valid procedure. In a number of cases it was not
possible to do so because of the low frequencies. In four cases the means
of the two groups were identical, or nearly so, and no computations were
made. This was true in the case of CF, …, sum of C, and FT. In the case of
the CF and … determinants, the means were the same for both groups. For the
sum of C, the means were … for the experimental group and 2.00 for the
control group. For the FT determinant, the mean was .80 for the
experimental group and .72 for the control group.

In the case of determinants where a plus or minus scoring is possible the
signs were disregarded. It was felt that the nature of the determinant,
rather than its quality, was the important factor. The procedure was also
desirable from the statistical point of view, to increase the number of
frequencies of some of the scorings.

The determinants for which t was computed include FV, FC, FY, C, F, F−, and
S. In only one case, the FV determinant, was there a significant difference
between the two groups, at the .05 level. The results are summarized in
Table 1.

t tests were applied to the various types of … The t test indicated only
one … the .05 level on … This was found on the number of … Differences
signi… found on the P, A …

A summary of the differences between the two groups on the content
responses may be found in Table 1.


Table 2

Average Rorschach of the Epileptic Group and the Control Group

                      Average Number of Responses

Type of Response      Epileptic Group    Control Group
Response Total              22                 19
W                            3                  4
D                           17                 14
Dd                           2                  1
F                           10                  8
F−                           5                  4
M                            1                  1
c                            0                  0
CF                           0                  0
noe 1 1 
state. o 0 
color n 1 
total nu 0 0 
compiled | 0 0 
are indic’ 1 0 
though } D 0 
with epi” 1 
8 8t, 
naming 2 ypes Spes 
time p 2 1 
more D 10 
TO hore 1 
Ap 29 29 
pobtal time Panty fe 
Poa per response os 12 2% 
TA 10.61” 16.74” 
ime/1 color R 12.24” te 
Time/1 black R 9.79" 13.86 
Rejections A 17.76" 
Exp/Bal 1⁄2 1 
Approach W:D:D = oa 
xpected Appr =a :D:Dd 
Poplar Pproach a 3/14/2 
Secondary content 1 4 
1 
Sum of C i 1 
2 


… response, or total time. The number of rejections was not significantly
different in the two groups, nor was the response to … group significantly
different from the …

It is interesting to see the pattern of the average Rorschach of the
epileptic group compared to that of the control group. This comparison is
presented in Table 2. The values have been rounded to the nearest whole
number. It is perhaps important to note that without the application of
proper statistical tests, differences other than those pointed up by the
statistical results would be easily inferred.


Summary 


The results of the present study fail to confirm most of the alleged
Rorschach indicators of epilepsy. The results may possibly be accounted for
in several ways. First, the present study seems to be one of the first in
which a rigorous system of matching experimental subjects with control
subjects has been used. In previous studies the experimental groups have,
for the most part, been compared only with the so-called normal Rorschach
pattern. Second, the current study made use of statistical tests of
significance, rather than … comparisons. Finally, this study was limited to
one particular diagnostic category of epilepsy of homogeneous severity,
while other studies have tended to lump together all types of epileptic
patients without regard to etiology. On the basis of the present study, the
Rorschach does not appear to be a useful clinical tool for the differential
diagnosis of idiopathic epilepsy.


Received December 21, 1956.


The Validity of Some Current Tests For Organicity


James W. Parker 
US Army Hospital, Fort Ord, California 


Some of the most difficult problems of diagnosis and evaluation are with
brain-injured patients. Consequently, a large number of them, either with
known lesions or merely those with suspected lesions, are referred to the
clinical psychologist for assistance and evaluation. All too frequently,
however, the psychological techniques for investigation of organicity fail
to offer significant contributions to the understanding of these patients,
and the psychologist must report that his findings are equivocal.

In a previous study dealing with the psychological investigation of brain
damage, the writer noted that tactual-kinesthetic perception could be used
as an effective technique in diagnosing brain damage (7). In the course of
that investigation, data were collected from several other psychological
techniques frequently employed as adjuncts in the diagnosis of injury to
the brain. In order to throw some light on the diagnostic acuity of these
techniques, the present report presents the findings resulting from an
analysis of these data.


Subjects 


The experimental group consisted of 30 adult male hospitalized patients
with definitely diagnosed brain injuries of fairly recent origin. … of them
had been injured within ten months of the time of this study. They ranged
in chronological age from 21 to … years, with a mean age of 25 years.
Mental ages, obtained from the Shipley-Hartford Retreat Scale,¹ ranged from
9 years, … months, to 18 years, 7 months, with an average mental age of 14
years, 1 month. Patients with gross sensory and motor defects on the
dominant side, neuropsychiatric patients, the noticeably mentally
deteriorated, and patients with severe visual defects were not included in
the investigation.

1 … demonstrated the Shipley-Hartford to be a reasonably … measure of
intelligence.

The control group of 30 adult male patients 
who revealed no evidence of brain damage was 
drawn at random from general medical hos- 
pital wards. They ranged in chronological age 
from 19 to 40 years, with an average age of 
23 years. Their average mental age was 15 
years, 2 months and ranged from 10 years to 
18 years, 7 months. The difference of 1 year, 
1 month in average mental ages between the 
two groups was not significant at the .05 level 
of confidence.

The importance of a control group consisting of hospitalized patients
receiving medical care for a physical disorder has too often been
overlooked by investigators. Studying only a brain-injured group and a
normal group, it has often been assumed that obtained differences were due
to the brain injury. However, this is not necessarily the case. There is
the possibility that the obtained differences may be due to any organic
condition and, therefore, found not only in the case of Ss with brain
damage.

Procedure

The Ss were referred by their ward physicians, and apparently regarded it
as a part of the routine procedure designed to facilitate understanding of
their condition. The following tests were administered: the Bender
Visual-Motor Gestalt test,² … the Shipley-Hartford Retreat Scale, the
Wechsler-Bellevue block-design subtest, and following a ten-minute rest
period, the Weigl-Goldstein-Scheerer Color-Form Sorting Test, and the
Wechsler Memory Scale. These tests were administered as prescribed in their
respective manuals.

2 The Bender … differentiated the brain-injured from the uninjured Ss
beyond the .01 level of confidence.



Shipley-Hartford Retreat Scale. If Goldstein is cor- 
rect in his emphasis on the brain-damaged patient’s 
“loss of the abstract attitude” (3), and if the claim 
is valid that the Shipley-Hartford Retreat Scale 
measures impairment of the capacity for abstract 
thinking (8), then the Shipley-Hartford should be 
effective in diagnosing brain damage. It was ad- 
ministered to the two groups of Ss, and their results 
compared in order to test the ability of this scale to 
diagnose brain damage in this population.

Block-design subtest of the Wechsler-Bellevue Scale. It is said that the
block designs of the Wechsler-Bellevue differentiate most clearly between
brain-injured patients and normal controls (9). For this reason it was
included in the present study, and the scores made by the experimental and
control groups were compared. Patients with intracranial pathology may be
expected to experience particular … this test, for as Teuber … the task …

… "… patterns by arranging the figures in a definite … materials, … figures
in different heaps or piles, not being particular about the spacial
position of the individual figures within each heap" (3, p. 111).


In order to determine if these claims would apply to the Ss of the present
investigation, a comparison of the performances of the two groups was made.
Frequency of pattern-building was considered, and a simple scoring system
was devised so as to have a quantitative measure of success or failure on
the task. Those Ss who proved able to sort and shift their method of
sorting voluntarily received a score of 2; Ss who learned to shift from
color to form received a score of 1; and those who neither shifted
voluntarily nor learned to shift their method of sorting received a score
of 0.
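
The three-step scoring system just described can be written as a short function. This is only an illustrative sketch: the function and argument names are hypothetical, and it encodes nothing beyond the 2, 1, 0 rule given in the text.

```python
def sorting_score(shifted_voluntarily: bool, learned_to_shift: bool) -> int:
    """Score a subject's sorting performance on the 2, 1, 0 scale."""
    if shifted_voluntarily:
        return 2  # sorted and shifted the method of sorting voluntarily
    if learned_to_shift:
        return 1  # shifted from color to form only after learning to do so
    return 0      # neither shifted voluntarily nor learned to shift


# Hypothetical subjects, for illustration only.
print(sorting_score(True, False))   # 2
print(sorting_score(False, True))   # 1
print(sorting_score(False, False))  # 0
```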

Wechsler Memory Scale. It was decided to include the Wechsler Memory Scale
in the present experiment in view of the author's statement that this test
". . . should be useful in detecting special memory defects in individuals
with specific organic brain injuries, and may prove of concrete value in
the examination of some of the soldiers and sailors returning with head
injuries" (10, p. 90). In addition to comparing the two groups on the basis
of memory quotients derived from this test, their performances on one
particular subtest, visual reproduction, were considered separately. Graham
and Kendall, studying the performance of brain-damaged patients on a
memory-for-designs test, observed that these patients made somewhat lower
scores than did the control group.


Results 


Shipley-Hartford Retreat Scale. The Shipley-Hartford conceptual quotient,
which is said to express the degree of impairment in abstract thinking
ability (8), failed to discriminate between the two groups in the present
experiment. The brain-injured group had a mean conceptual quotient of 88.3,
while the controls averaged 80.3. This difference of 8.0 points proved to
be not significant at the … level of confidence, which would suggest either
that these brain-injured Ss were not experiencing a loss in conceptual
thinking ability, or that this scale is actually not sensitive to such
deficits. In view of the negative findings of previous experiments (1, 6),
the latter possibility is not unlikely.

Wechsler-Bellevue Block-design subtest. This subtest differentiated the
brain-injured from the uninjured Ss at a highly significant level. The
average raw score for the injured was 17.4, while that of the uninjured was
25.0. This difference of 7.6 is significant beyond the .01 level of
confidence. If the median score of 21 were used as a cutting score, 69 per
cent of the brain-injured Ss would fall below this, and 37 per cent of the
controls would be included in this lower half. Scores below 16 would appear
to be particularly suggestive of brain damage, for while 30 per cent of the
experimental group scored below this, only … per cent of the control group
scored this low. Furthermore, these data suggest that very high scores on
this test are rarely encountered among brain-injured Ss, for while nearly
50 per cent of the controls scored 27 and above, only 3.4 per cent of the
experimental group scored this high.
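
The percentages quoted for a cutting score are simple proportions of each group falling below it, which can be sketched as follows. This is an illustration only: the raw scores are invented and the function name is hypothetical; the individual block-design scores are not given in the report.

```python
def percent_below(scores, cutting_score):
    """Percentage of scores that fall strictly below the cutting score."""
    return 100.0 * sum(s < cutting_score for s in scores) / len(scores)


# Hypothetical block-design raw scores, for illustration only.
injured = [12, 15, 18, 22, 25]
controls = [17, 23, 26, 27, 30]

for cut in (21, 16):
    print(cut, percent_below(injured, cut), percent_below(controls, cut))
```

Lowering the cutting score reduces the proportion of controls misidentified, at the cost of detecting fewer of the injured Ss, which is the trade-off the figures for 21 and 16 illustrate.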
Weigl-Goldstein-Scheerer Color-Form Sorting Test. With this test scored
according to the aforementioned method, a chi-square analysis indicated
that the experimental and control groups could not be differentiated on
this basis. Not only were they alike in their abilities to sort the stimuli
and shift their methods of sorting voluntarily, but no significant
differences were noted in their tendencies to build patterns with the test
materials. The practice of placing the stimuli one on top of another in
their sortings was not considered pattern-building in the foregoing
analysis; however, this type of performance did differentiate the two
groups, for 16 of the control group and only 8 of the brain-injured group
did this. There was only one S in the control group who neither built
patterns nor piled the stimuli one on top of the other, while in the
experimental group four Ss did neither of these things.

These data seem opposed to Goldstein's contention that while the uninjured
"casually throws" (3) the forms into different heaps, the brain-injured
patient's spatial arrangement of the figures cannot be arbitrary. This
discrepancy between the results of Goldstein's study and those of the
present investigation brings to mind the question raised by Hebb, who
maintained that "One must object on logical grounds to Goldstein's emphasis
on cases which support his interpretations without regard for those which
are opposed."
Wechsler Memory Scale. The performances of the two groups of Ss on this
scale were practically identical. The average memory-quotient obtained by
the brain-injured Ss was …, and that obtained by the controls was …, a
difference of only 0.2. The memory-quotients for the brain-injured ranged
between … and …, and for the uninjured between … and …. Similar findings
resulted from analysis of the visual-reproduction subtest of this scale. On
this subtest, the brain-damaged patients averaged 8.87, and the controls
9.07, again only a 0.2 difference.

The above results would suggest either that these two groups of Ss were equal in memory ability, or that memory-quotients obtained from this scale are not valid indices of this ability. This latter possibility seems more plausible, since the intracranial lesions (variously located and, in some instances, quite extensive) in the brain-injured Ss would very probably have affected memory ability in this group. Furthermore, Eysenck and Halstead (2) found that abilities involved in clinical tests of “memory,” such as those included in the Wechsler Memory Scale, were the same as those involved in intelligence tests. If this is true, then the two groups in the present study would be expected to score equally well on “memory” tests, since they were found to be equal in mental ages.


Summary 

The purpose of the present study was to compare the results of brain-injured Ss with the results of uninjured Ss on some commonly employed tests for organicity in order to investigate the diagnostic acuity of these techniques. Sixty hospitalized patients were examined, half having brain injuries of fairly recent origin, and half revealing no evidence of [illegible]ical involvement. Using the Shipley-Hartford Retreat Scale, the Weigl-Goldstein-Scheerer Color-Form Sorting Test, the Wechsler-Bellevue block-designs, and the Wechsler Memory Scale, only the Wechsler-Bellevue block-design subtest significantly differentiated the two groups. It was made apparent that [illegible] as aids in diagnosing brain damage may be of no benefit in this respect.

Received [illegible]mber 19, 1956.
The Relation of the Archimedes Spiral Aftereffect
and the Trail Making Test to Brain
Damage in Children¹


Anthony Davids, Louis Goldenberg, 
Brown University and Emma Pendleton Bradley Home 


and Maurice W. Laufer 
Emma Pendleton Bradley Home 


The present study represents an attempt to explore further the clinical utility of two tests of brain damage that have recently been reported in this Journal. Specifically, this experiment is an attempt to extend and to relate the studies reported by Price and Deabler (4) and by Reitan (5).

Price and Deabler reported that “Organic [illegible] cortical involvement can be differentiated from nonorganic with high degree of validity by means of the spiral aftereffect technique” (4, p. 302). Results of another experiment, based on the work of these investigators, led Gallese to conclude, “This study agrees with Price and Deabler that the spiral test is of value in differentiating between patients with organic brain damage and those without” (2, p. 257). In an independent paper, Reitan presented results obtained with the Trail Making Test, suggesting that “. . . this short, inexpensive, and easily administered test may be a fairly valid indicator of certain effects of brain damage” (5, p. 394). These studies were concerned with organic damage in adult patients. In the present study, these two tests were brought to bear upon the problem of differentiating between children with cortical damage and children without known cortical impairment.

¹ We wish to express our gratitude to the staff of the Home for their assistance and cooperation in the execution of this research. Appreciation is extended to the following members of the East Providence school department who made their facilities available to us and provided the Ss for the normal group: Mr. Edward E. Martin, superintendent; [illegible], assistant superintendent; and Mrs. [illegible], psychologist. Sincere appreciation is likewise extended to Mr. Raymond Holden and Miss [illegible] of the Meeting Street School for their assistance in testing the cerebral palsy children. We wish to thank Dr. Ralph M. Reitan for providing us with copies of the Trail Making Test.


Method 


Apparatus and test material. The spiral 
aftereffect test employed an electric motor at- 
tached to a shaft to produce the necessary 
rotation. The motor was housed in a small 
wooden box, painted dull black, with the shaft 
protruding through an opening in the front of 
the box. An Archimedes spiral of 920° or 2½ circuits about the center was painted black against a white background and was glued on a disk of white plastic 8 in. in diameter. A mirror image of this spiral was also constructed and glued on a similar plastic disk in order to give the effect of opposite rotation. Thus, two spirals were constructed, Spiral A which gives normal Ss a negative aftereffect of expansion, and Spiral B which gives a negative aftereffect of contraction. The disks on which these spirals were mounted were attached, interchangeably, to the shaft protruding from the black box and were turned by the motor at approximately 78 revolutions per minute.

The Trail Making Test used in this experiment was adapted by Reitan from one of the performance subtests of the Army Individual Test (6) that has also been published as a separate test by Partington and Leiter (3). It is divided into two parts, each consisting of one sheet of 8½ by 11 paper containing sample material on one side and the test material on the other side. Part A consists of 15 circles distributed over the entire page and numbered 1 to 15. Part B consists of 15 circles, eight numbered from 1 to 8, and seven lettered from A to G. The only difference between this children's form of the Trail Making Test and the adult form used by Reitan (5) is that each part of the latter form of the test contained 25 circles.

Subjects. Three groups of Ss were employed in this study. One group consisted of 15 children from the Meeting Street School in Providence, Rhode Island, which is a special school [illegible] suffering from cerebral [illegible] known cortical damage. The organic group was composed of seven boys and eight girls, with a mean age of 10.4 [illegible] composed of 23 boys and 6 girls, with a mean age of 10.3 years and a mean IQ of 93, as measured by the WISC. A third group consisted of 24 normal children from the East Providence, Rhode Island public school system. These children had no known history of organic impairment. This group was composed of 12 boys and 12 girls, with a mean age of 10.5 years and a mean IQ of 97, as measured by the California Test of Mental Maturity (Short Form).

Procedure. Individual administration of the two tests required about 20 minutes per S, and was conducted during the daytime in a room with good illumination. The first part of the testing session was devoted to the spiral aftereffect test. The procedure and instructions were very similar to those described in detail by Price and Deabler (4). Spiral A, giving normal Ss a negative aftereffect of expansion, was presented first in half the cases. While the disk was turning, the S was asked, “What does the line appear to be doing?” As soon as the disk was stopped, after 30 seconds of rotation, the S was asked, “Now what does the line appear to be doing?” and the answer was recorded. The procedure was repeated with Spiral B, giving normal Ss a negative aftereffect of contraction. Repetitions of presentation gave each S a total of 6 trials, whereas Price and Deabler's Ss received only 4 trials. Each correct perception of the negative aftereffect was scored one, while failure to perceive the negative aftereffect was scored zero. Doubtful or equivocal responses were scored ½ and, for all Ss, fractional scores were raised to whole scores in final computations. Thus, the Ss received total scores ranging from 0 to 6.
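The scoring rule just described can be sketched in a few lines. This is only an illustration: the function name `spiral_total` is hypothetical, and the sketch assumes that a doubtful response counts one half and that "raised to whole scores" means rounding a fractional total up to the next whole number.

```python
import math

def spiral_total(trial_scores):
    """Total spiral aftereffect score for one S (hypothetical helper).

    Each of the 6 trials is scored 1 (aftereffect perceived),
    0 (not perceived), or 0.5 (doubtful; an assumed value).
    A fractional total is raised to the next whole score.
    """
    allowed = {0, 0.5, 1}
    if len(trial_scores) != 6 or any(s not in allowed for s in trial_scores):
        raise ValueError("expected 6 trial scores, each 0, 0.5, or 1")
    return math.ceil(sum(trial_scores))

# One correct, one doubtful, four failures: 1.5 is raised to 2.
example_total = spiral_total([1, 0.5, 0, 0, 0, 0])
```

Under these assumptions every S ends up with a whole-number total between 0 and 6, which matches the range reported in the text.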
The second part of the testing session was devoted to administering the Trail Making Test. On Part A the S was required to connect the circles with a pencil line as quickly as possible, beginning with 1 and proceeding in numerical sequence. Then the S was administered Part B, on which he was required to connect the circles, alternating between numbers and letters, taking both series in ascending sequence (i.e., 1 to A to 2 to B, etc.). On both parts of the test, whenever an error was made the examiner pointed it out immediately and requested the S to make the correction. Since scores for the two parts of the test were the number of seconds required for completion of the task, errors contributed to scores to the extent that additional time was needed for corrections. The raw scores in seconds were used for statistical analyses.


Results

According to the t test, there were no significant age or IQ differences among the groups. A point biserial r was computed between IQ and spiral aftereffect scores for each of the groups, yielding coefficients of +.27 (normal group), +.29 (organic group), and +.17 (psychiatric group).


Tests of Brain Damage in Children 


None of these correlations are statistically significant, indicating no relation between spiral scores and IQ.

Spiral aftereffect scores obtained by Ss in each of the three groups, together with percentage computations, are shown in Table 1. It is readily apparent that there are pronounced differences among the groups. A chi-square test of association, relating the normal, organic, and psychiatric categories to three scoring categories on the spiral test (failure, partial, or complete success), yielded a value of 26.76 which is significant beyond the .001 level.

The mean spiral scores and the variances for the normal, organic, and psychiatric groups are presented in Table 2. It is evident that with the normal Ss there is little variability and the mean score is close to a perfect score. The psychiatric cases are quite variable, but on the average this group did significantly better than did the organic group. This latter group (organic) not only demonstrated considerable variability, but in general their mean score indicated a definite failure to perceive the aftereffect.

Now let us consider results obtained with the Trail Making Test. For each of the three groups product-moment correlations were computed between IQ scores and scores obtained on Trail A and on Trail B. The only significant coefficient was found within the normal group, where there is a significant correlation between IQ and performance on Trail B. The magnitude of this relation (−.60)

Table 1

Number and Percentage of Each Group Achieving
Each Spiral Aftereffect Score

Ro 
Agel Group 
ery 
3 . 
sect Normal Psychiatric Organic 
re 
S n % n 9% n % 
Gear) 20 33 18 62 2 13 
3 1 4 Dui Onno 
2 Lua 3 10 Daki 
1 1 4 or 0. 1 7 
oo On 0 
k None) oo 2 7 0 0 
Ing 4 14 10 67 
Table 2


Comparison of Spiral Aftereffect Scores Obtained 
by the Three Groups 


Group Mean Variance 
Normal 3:9: 1.9 
Psychiatric 4.5 5.3 
Organic 1.4 5.1 

Groups Compared            F        t
Normal vs. Organic        2.68*    6.70**
Normal vs. Psychiatric    2.79**   2.10*
Psychiatric vs. Organic   1.04     4.21**


* Significant beyond .05 level. 
** Significant beyond .01 level. 
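The F values in Table 2 can be checked against the tabled variances. The sketch below assumes that each F is simply the ratio of the larger group variance to the smaller one; the helper name `variance_ratio` is illustrative and does not come from the paper.

```python
def variance_ratio(var_a, var_b):
    """F as the larger variance divided by the smaller (assumed convention)."""
    return max(var_a, var_b) / min(var_a, var_b)

# Variances of spiral aftereffect scores as printed in Table 2.
variances = {"Normal": 1.9, "Psychiatric": 5.3, "Organic": 5.1}

f_values = {
    "Normal vs. Organic": variance_ratio(variances["Normal"], variances["Organic"]),
    "Normal vs. Psychiatric": variance_ratio(variances["Normal"], variances["Psychiatric"]),
    "Psychiatric vs. Organic": variance_ratio(variances["Psychiatric"], variances["Organic"]),
}

for pair, f in f_values.items():
    print(f"{pair}: F = {f:.2f}")
```

Rounded to two decimals, these ratios reproduce the tabled F values of 2.68, 2.79, and 1.04.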


suggests that with further refinement this test 
might be used as a brief and easily adminis- 
tered measure of intelligence in normal chil- 
dren. Although all of the other coefficients are 
in the direction indicating a negative relation 
between intelligence and time required to com- 
plete the trails, none of these coefficients are 
significant. Thus, it must be concluded that 
within the organic and the psychiatric groups 
there is no significant relation between IQ and 
trail scores. 

Mean scores and variances obtained on both 
parts of this test by each of the three groups 
and the significance of these differences are 
shown in Table 3. These findings indicate 
quite vividly that the normal Ss are less vari- 
able and perform significantly better on both 
parts of this test than do the other two 
groups.² On Part A, the psychiatric Ss perform significantly better than do the organic
Ss, but on Part B there is no difference be- 
tween these two groups of patients. 

Finally, results on the two independent tests 
were related by means of chi-square analyses 
in order to see if they were associated signifi- 
cantly. Within both the organic group and the 
normal group it was impossible to form an 
adequate dichotomy on the basis of spiral 


² The significance of the differences between these groups was computed using the formulae and procedures described by Walker and Lev (7, p. 157), which take account of the fact that the variances are heterogeneous. These highly significant differences between the means were also obtained employing the method suggested by Cochran and Cox (1) for use when there are significant differences in the variance of responses.

Table 3

Comparison of Trail Making Test Scores (in seconds) Obtained by the Three Groups

                        Trail A               Trail B
Group               Mean   Variance      Mean         Variance
Normal               19       36      [illegible]   [illegible]
Psychiatric          25       75          83        [illegible]
Organic              32      163          84        [illegible]

                              Trail A               Trail B
Groups Compared              F        t          F          t
Normal vs. Organic         4.53**   3.61**     7.27**     3.19**
Normal vs. Psychiatric     2.08*    2.96**    12.80**   [illegible]
Psychiatric vs. Organic    2.17*    1.91*      1.76     [illegible]

* Significant beyond .05 level.
** Significant beyond .01 level.


[illegible] significant association (χ² = 3.70, p = .027) between the psychiatric Ss' performance on this test and their performance on the spiral test. Moreover, when the three groups were combined, a coefficient of 9.61 (p = .01) was obtained, indicating that for the overall heterogeneous group, consisting of normals, organics, and psychiatric patients, there was a significant degree of association between Ss' performance on these two tests of organic impairment.

Discussion

The results of this investigation confirm those reported by Deabler and Price, Gallese, and Reitan. The present study is actually an extension of these studies, and demonstrated the applicability of these tests for use with children. A further contribution of this research is the fact that both tests, which had previously been employed independently, were used in a unified experiment. Thus, all in all, it was found that the findings of previous investigators could be replicated and extended to children with organic impairment, and that there is significant association between the results derived from the two tests. These findings suggest that both these instruments possess considerable potentiality as valid procedures for diagnosis of organic impairment in children.

In considering the association between the two tests it is interesting to note that within the psychiatric group Trail A scores were not related to spiral scores, while Trail B scores were related to spiral scores. This finding suggests that Trail A may be more susceptible to impulsivity and that merely connecting numbered circles as rapidly as possible does not provide a measure of organic impairment. That is, it is possible that the Ss rushed through the task which was cognitively very simple and quite concrete, and that there was no relation to their performance on a test such as the spiral test which was probably less susceptible to impulsivity and more a measure of cortical involvement. When confronted with Trail B, however, the task [illegible] complex and the necessity [illegible] between numbers and letters may well have been abstract enough to provide a measure of organic impairment rather than of impulsivity. At any rate, on the basis of these findings, it would seem that Trail B is probably a better measure of organicity than is the more simple Trail A test.

It is well known that with severely disturbed children contradictory and inconsistent findings are frequently obtained from conventional physiological and psychological tests of organic impairment. Working with children who are undergoing residential treatment, we plan to conduct additional studies in the attempt to gain further understanding of the validity of these assessment methods and of the role of organicity in emotional disturbances of childhood.

Summary 


The spiral aftereffect test and the Trail Making Test were administered to a group of normal children, a group of emotionally disturbed children, and a group of children suffering from cortical damage. These two tests, used previously in unrelated studies, had been found to differentiate between adults with and without organic brain damage. In the present study both tests were found to reveal significant differences among the groups, as predicted, and both appear to possess considerable promise as valid methods for assessing cortical impairment.
The Tendency of the Dörken and Kral Brain Damage
Measure to Score False Positives¹

Harry L. Saslow and William G. Shipman
Staunton Clinic, University of Pittsburgh Medical School

From neurological principles and 190 case records, Dörken and Kral (1) derived a novel scoring system whereby organicity is indicated by the absence of common Rorschach scores. Their “Organic Deficit Rating” (ODR) [illegible] not discriminated as [illegible]. Intelligence was [illegible] sample was about [illegible] Pearson correlation of .96. Forty-eight of the 108 nonorganics (44%) satisfied the ODR criteria for organicity. The medical records of these cases were checked by board psychiatrists to see if the patients had a previously undetected organic condition. Two organics and three mental defectives were thus found. With this correction 42 cases (39%) were misclassified by the ODR. This parallels the findings of others and stands in contrast to the [illegible] per cent inaccuracy found originally by Dörken and Kral.

The scale seems very sensitive to the variable of intelligence since a Pearson correlation of .52 was found between it and Wechsler-Bellevue IQ. Two-thirds of the cases with IQs below 100 obtained an organic rating. There were no false positives among the cases with an IQ above 115.

Since anxiety has been shown to constrict Rorschach performance and the ODR tends to label such constricted personalities organic, a check was made on this relationship. The mention of anxiety in the full psychiatric diagnosis, however, did not apply proportionately to more of those classified organic than to those classified nonorganic. False positives did not cluster significantly in any of the medical or psychiatric diagnostic categories applied to the patients.

In summary, the 39 per cent misclassification of functionals by this organicity sign suggests its doubtful utility in an outpatient setting.

Brief Report.
Received June 5, 1957.

¹ An extended report of this study may be obtained without charge from W. G. Shipman, Staunton Clinic, University of Pittsburgh Medical School, Pittsburgh, Pa., or for a fee from the American Documentation Institute, [illegible] remitting $1.25 [illegible] copies.
Base Rate and the Archimedes Spiral Illusion¹

Donald W. Stilson, Malcolm D. Gynther, and Boris Gertz
South Carolina State Hospital and the University of South Carolina

Recent studies by Price and Deabler (3) and by Page et al. (2) provide information concerning the remarkable sensitivity of the spiral aftereffect as an instrument for the detection of organic brain damage. However, the figures which are reported in these studies do not provide evidence for the frequency with which the practicing clinical psychologist can expect to differentiate correctly between cases of organic brain damage and functional disorders using the report of the spiral aftereffect as a diagnostic criterion. The reason for this is amply illustrated by Meehl and Rosen (1) in that these investigations have not considered the frequency of organic brain damage (“base rate” in the terminology of Meehl and Rosen) in the hospital populations from which they have drawn their experimental samples. It is the purpose of this paper to show the effects of estimates of these base rates on the proportion of correct differential diagnoses which can be anticipated, given the sensitivities reported in the investigations mentioned above. In addition, certain general implications of these findings will be discussed.

For the experiments to be discussed, base rates were estimated in accord with the descriptions of the subject samples given by the investigators. Also, the rates were estimated for resident patients, as this appeared to fit the description of the subjects used in these studies. Since the experiments were intended to evaluate a diagnostic tool, first admissions would have been a more appropriate group from which to sample.

¹ The authors wish to thank Dr. M. Kershaw [illegible] a critical reading of the manuscript.
² [illegible] University of Colorado.

Base Rates and Previous Experimental 
Findings 


Using Price and Deabler's description (3) of their organic population, and referring to the results of the annual census of mental patients conducted by the Public Health Service in 1951 (4), it is possible to estimate that about 16%³ of the total VA hospital resident population will be included among the categories listed by these investigators. This value, 16%, provides an estimate of the organic base rate for the population from which Price and Deabler obtained their experimental sample.

The base rate implies that if all the patients in this population were diagnosed as “nonorganic” without any test information whatever, then the diagnosis would be incorrect in 16% of the cases.

Price and Deabler found that 98% of their organic group and only 5% of their nonorganic patient group failed to report the spiral aftereffect on four out of four trials. As Page et al. did not use any nonpatient subjects, the normal group of Price and Deabler is not included in order that the analyses of the two studies might be comparable. Their patient sample included 75% known organics and 25% nonorganics, and the aftereffect led to a correct diagnosis for 156 out of 160 cases or 2.5% incorrect diagnoses. If their sample had contained only 16% organic patients (as would be more nearly the case in actual use of the spiral aftereffect in a VA hospital population), the total number of incorrect diagnoses would have been approximately 4.5%. This percentage was obtained by noting that of the 16% organics in that population, 98% would be detected by failure to report the aftereffect on all four trials, whereas, of the 84% nonorganics in the population, only 5% would fail to report the illusion. Of course use of the aftereffect still represents a substantial reduction in the percentage of errors over what would be obtained using only the base rates for diagnosis, i.e., a 16.0 − 4.5 = 11.5% reduction in incorrect diagnoses.

³ [illegible] From published data (4), it is difficult to decide exactly which [illegible] should be included under the label [illegible]. Although Meehl and Rosen (1) assert that determining base rates is simply a matter of computation, decisions must be made with regard to exclusions and inclusions if one is to select the appropriate rate.
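The arithmetic behind the 4.5% figure can be written out explicitly. The sketch below is an illustration only; the function name `misclassification_rate` and its parameter names are assumptions introduced here, and the inputs are the percentages quoted from Price and Deabler.

```python
def misclassification_rate(base_rate, detected_organic, false_positive_nonorganic):
    """Expected proportion of incorrect diagnoses in a population.

    base_rate: proportion of organics in the population
    detected_organic: proportion of organics failing to report the aftereffect
    false_positive_nonorganic: proportion of nonorganics failing to report it
    """
    missed_organics = base_rate * (1 - detected_organic)
    mislabeled_nonorganics = (1 - base_rate) * false_positive_nonorganic
    return missed_organics + mislabeled_nonorganics

# Figures quoted in the text: 16% base rate, 98% of organics and
# 5% of nonorganics failing to report the aftereffect on all four trials.
rate = misclassification_rate(0.16, 0.98, 0.05)
print(f"incorrect diagnoses: {rate:.1%}")             # about 4.5%
print(f"reduction vs. base rate: {0.16 - rate:.1%}")  # about 11.5%
```

The two terms are 0.16 × 0.02 = 0.0032 and 0.84 × 0.05 = 0.0420, which sum to 0.0452, or about 4.5%.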
In the study of Price and Deabler, ; 


2 age 
organic sample was so identified, eat 


i is di 4 
since 23% of ifficult to determine accurately 
frontal lobotomy cases for whom no frequency figures are given in the census (4).
The best estimate of the base rate for the Page et al. study appears to be .13. This figure differs from the rate estimated for the Price and Deabler study for two reasons: (a) Page et al. used different categories of organics than did the earlier investigators, and (b) the rates for diagnostic classes are different for the two populations sampled (e.g., cerebral arteriosclerotics compose .064 of the state hospital population and only .036 of the VA hospital population). If no test had been employed and all the patients in this population had been diagnosed “nonorganic,” this procedure would lead to incorrect diagnoses in 13% of the cases. Using the 40% organic and 85% nonorganic subjects' reporting the aftereffect on at least one trial (2) as a key for making the diagnosis, the “misses” calculated using the 13% organic base rate would be 18%. In this instance, use of the Archimedes spiral results in an increase in misses (i.e., 18.0 − 13.0 = 5.0% more incorrect diagnoses) as compared to predictions from the base rate alone!
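The comparison with a blanket "nonorganic" diagnosis can be sketched in the same way. The helper `compare_with_base_rate` is hypothetical, and the sketch assumes that an organic patient who reports the aftereffect on at least one trial is missed, while a nonorganic patient who never reports it is mislabeled.

```python
def compare_with_base_rate(base_rate, miss_rate, false_alarm_rate):
    """Return (test error, blanket-diagnosis error, change in error).

    miss_rate: proportion of organics the test calls nonorganic
    false_alarm_rate: proportion of nonorganics the test calls organic
    """
    test_error = base_rate * miss_rate + (1 - base_rate) * false_alarm_rate
    blanket_error = base_rate  # calling every patient "nonorganic"
    return test_error, blanket_error, test_error - blanket_error

# Page et al. figures quoted in the text: 13% base rate, 40% of organics
# and 85% of nonorganics reporting the aftereffect on at least one trial.
test_error, blanket_error, change = compare_with_base_rate(0.13, 0.40, 1 - 0.85)
print(f"test: {test_error:.1%}, base rate alone: {blanket_error:.1%}")
```

With these inputs the test error is 0.13 × 0.40 + 0.87 × 0.15 = 0.1825, which the text rounds to 18%, so the change is positive: the test adds errors rather than removing them.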


General Implications 


From the preceding discussion, it is clear that at least two kinds of diagnostic measure can be estimated for a given test. The first, which will be called sensitivity, is the proportion of cases of a particular diagnostic class (e.g., “organics”) which the test identifies as members of this class. The second, discrimination, is the proportion of all cases in a particular population which are correctly diagnosed by the test. Sensitivity is unrelated to base rates in a population whereas discrimination depends greatly on the base rates in the populations being considered. The two studies cited here give information concerning the sensitivity of the spiral aftereffect in detecting organic and nonorganic patients. Using these sensitivity figures, this paper has pointed out the discrepancy between sensitivity and discrimination of the spiral aftereffect for patient populations similar to those sampled by the two studies.

In a practical situation, discrimination, not sensitivity, is of most concern. The psychologist in a mental hospital must make diagnoses within a particular population and the adequacy of any diagnostic instrument he uses must be determined with the base rates for that population taken into account.

For the practicing clinician, an effective psychological test for organicity must be able to distinguish between functional disorders and members of the organic group who present diagnostic problems which cannot be resolved better by means of neurological, biological, physiological, or other psychological examinations. If organic patients who are readily diagnosable by other methods (e.g., paretics) are eliminated from a state hospital population, approximately 6.1% of the remainder will be organics and will be among those referred for differential diagnosis. Thus, a diagnosis of “nonorganic” for all of these patients will be incorrect in 6.1% of the cases. If the sensitivities reported by Page et al. are used for diagnosis within this population, then 16.5% of the diagnoses would be incorrect, a 10.4% increase in diagnostic errors! With the higher sensitivities reported by Price and Deabler, the proportion of errors still would be 4.8%, which is nearly as many as the 6.1% errors obtained by merely diagnosing all patients “nonorganic.” These results demonstrate the critical role of population base rates in judging the usefulness of a diagnostic instrument.
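One way to make the dependence on base rate concrete is to ask at what base rate a test begins to outperform the blanket "nonorganic" diagnosis. The sketch below is a derivation added for illustration and is not part of the original analysis; the function names are hypothetical. Writing b for the base rate, m for the proportion of organics missed, and f for the proportion of nonorganics mislabeled, the test helps only when b·m + (1 − b)·f < b, that is, when b > f / (1 − m + f).

```python
def error_rate(base_rate, miss_rate, false_alarm_rate):
    """Proportion of incorrect diagnoses when the test is used."""
    return base_rate * miss_rate + (1 - base_rate) * false_alarm_rate

def break_even_base_rate(miss_rate, false_alarm_rate):
    """Base rate above which the test beats calling everyone nonorganic."""
    return false_alarm_rate / ((1 - miss_rate) + false_alarm_rate)

# (miss rate, false-alarm rate) taken from the percentages quoted in the text.
studies = {
    "Page et al.": (0.40, 0.15),
    "Price and Deabler": (0.02, 0.05),
}

referral_base_rate = 0.061  # organics among patients referred for diagnosis
for name, (miss, false_alarm) in studies.items():
    err = error_rate(referral_base_rate, miss, false_alarm)
    threshold = break_even_base_rate(miss, false_alarm)
    print(f"{name}: error {err:.1%}, break-even base rate {threshold:.1%}")
```

On these assumptions the Page et al. criterion needs a base rate above about 20% before it helps, whereas the Price and Deabler figures break even near 4.9%, just below the 6.1% referral base rate. This is consistent with the text's observation that the Price and Deabler sensitivities yield almost as many errors as the blanket diagnosis.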


Using the report of the aftereffect on at least one trial as the criterion for a diagnosis of nonorganicity, the sensitivity of the Archimedes spiral was found to be greater for nonorganics than for organics in both studies. In the Price and Deabler study, it was possible to identify correctly 100% of the nonorganic subjects using this criterion, and in the Page et al. experiment, 85% of nonorganics were correctly classified. On the other hand, only 60% of the organic subjects were so identified in both studies. The implication of these findings is that the test apparently can be used with considerable confidence to diagnose nonorganics. This emphasizes the need for developing a test with a high sensitivity for organics to be used in conjunction with the Archimedes spiral.


Conclusions 


This paper distinguishes between evaluating 
the sensitivity and the discrimination of a 
diagnostic instrument. It shows the critical 
role played by population base rates in de- 
termining the discrimination of a diagnostic 
test, even when the sensitivities for the test 
are very high. Examination of the base rates 
may show that basing diagnostic decisions on 
a greater amount of valid information may 
increase the number of incorrect diagnoses 
that are made. Although this paper has fo- 
cused on the diagnosis of organics with the 
Archimedes spiral, the implications are of general import for the use of all diagnostic instruments.
PSYCHOLOGICAL TEST REVIEWS


New Tests 
Martin M. Supervisory Practices Test. Adult. 
1 form, Untimed, (20) min. Test booklet (20¢) ; 
key (25¢) ; manual, pp. 7 ($1.00) ; specimen set 


($1.00). New Rochelle, N, Y, (71 Hanson Lane): 
Author, 1957, 


The Supervisory Practices T, 
tiple-choice items whi i 


of groups, 
and th 


—L. Fi S. 
Cattell, Raymond B. The I. P. A. T. Anxiety Scale (“Self Analysis Form”). College-adult. 1 form, Untimed, (10) min. Questionnaire booklet ($3.00 per 25); key (40¢); handbook, pp. 21 ($1.00); specimen set ($2.00). Champaign, Ill.: Institute for Personality and Ability Testing, 1957.


[illegible] anxiety factor: 12 items from O (guilt proneness), 10 from Q4 (id pressure), 8 from Q3 (−) ([illegible] integration), 6 from C (−) (ego weakness), and 4 from L (paranoid insecurity). The number of items from each primary approximates its [illegible] weight in the general factor of anxiety. The Anxiety Scale, therefore, rests mainly on the construct validity obtained from the factor analyses.

There are also empirical [illegible]. The scale is reported to have high correlation with [illegible] test factor (U. I. 24) identified as anxiety. It separates normals from anxiety hysterics: about three-fourths of normals fall below, and about three-fourths of anxiety cases above, a score [illegible]. The corrected split-half reliability of the total anxiety score is reported as from .84 to .91, varying with the dispersion of the sample.

In addition to the total score, [illegible] are suggested by the manual. The [illegible] judged more “cryptic” (subtle) and the second [illegible] more symptomatic or obvious. The ratio of the two scores is suggested as an overt-covert [illegible] showing the degree to which the examinee is aware of his anxiety. Unlike the total score, however, the ratio is supported by no evidence [illegible] research. The concept is interesting but its application is hazardous. A score may also be [illegible] for each primary factor represented in the items. [illegible] brevity, the part reliabilities are only from .26 to [illegible], and their interpretation is discouraged.

Norms for the total scores are based on [illegible] men and 313 women, of whom a [illegible] but unspecified number are college students. [illegible] corrections for sex and age are stated, but are [illegible] adequate data. The manual [illegible] 29 references. It seems evident that the I. P. A. T. Anxiety Scale has a sounder conceptual base than other current instruments of its type. Many of the [illegible] of its scores remain to be established [illegible] research, which will almost surely be forthcoming.
L. F. S.

Journal of Consulting Psychology
Vol. [illegible]


The Wittenborn Psychiatric Syndromes: 
An Oblique Rotation 
Veterans Administration¹


There is now substantial evidence that symptom syndromes among mental hospital patients are relatively stable. The same factors are found despite differences in hospitals and patients sampled (1, 3, 7), differences in raters (8), and differences in rating schedules. However, there is by no means complete agreement in this domain. To some extent disagreements are due to the failure to include ample reference or marker variables. For example, a compulsive-phobic syndrome cannot emerge in a study in the absence of a suitable number of defining variables in the correlation matrix. Other differences, more easily reconciled, have their origin in theoretical biases. Some workers systematically utilize an orthogonal framework for their rotation, while others use oblique or correlated frames of reference.

It has not been uncommon in studies of cognitive and perceptual abilities to offer both oblique and orthogonal rotational solutions for a given body of data. The degree of fit to the data, the plausibility of inferred constructs, and the extent of consistency with the results of other investigators may then be compared. It would seem worthwhile to obtain alternative solutions for a few of the more important psychiatric symptom studies available, for the purpose of securing greater consistency from study to study.

1 Veterans Benefits Office, Washington, D. C.

Wittenborn and Holzberg (9), using an orthogonal frame of reference, identified a schizophrenic excitement factor which has not been found in other similar investigations. On the other hand, they failed to isolate a parameter of thinking disorganization or dissociation which has appeared in several analyses reported by Degan (1) and Lorr, Jenkins, and O'Connor (3). It is hypothesized that rotation of the Wittenborn-Holzberg data to an oblique frame of reference will result in the disappearance of the Schizophrenic Excitement factor, reveal a factor of thinking disorganization, and improve the factor structure generally.

The Wittenborn-Holzberg data are chosen
for reanalysis because they were collected 
under controlled conditions, are comparable 
to other large investigations, and have been 
properly rotated. Psychiatrists rated 250 
newly admitted patients to a Connecticut 
state hospital over a six-month period on the 
basis of observations made during a standard 
period when no patient was under treatment. 
Alcoholic, senile, sclerotic, paretic, and psy- 
chopathic patients were excluded. 


Procedure 


The orthogonally rotated factor matrix reported (9) was rerotated blind by means of the single plane method with the object of attaining oblique simple structure. The correlations between the primary vectors resulting from this analysis were then in turn factored by the multiple group method, and rotated in accordance with the usual criteria. Finally, by means of a suitable transformation, the correlations of the scales with each of the second-order factors were obtained. These correlations provide a much superior
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Table 1 


Rotated Oblique Factor Matrix V 


(Decimal points are omitted) 


Scale Description 


Difficulty in sleeping
Rapid change of ideas
Unjustified sexual beliefs
Obsessional thinking
Unrealistic self blame
Gives in easily to others
Restless
Little concern for others
Use made of physical disease symptoms
Eats very little
Impudent or impolite
Expresses irritation
Avoids people
Talks loudly
Behavior affected by phobias
Slovenly and unkempt
Engrossed in plans
Feelings of impending misfortune
Conspicuously optimistic
Difficulty carrying out plans
Doubts he can be helped
Concerned with impression made on others
Thinking bizarre or obscure
Little organic basis for complaints
Feels conspired against
Feels others control his behavior
Acutely distressed by anxiety
Organic pathology with emotional basis
Concerned with orderliness
Attention-demanding
Overt activity slowed or delayed
Grandiose notions
Believes he has no psychological problem
Exhibits compulsive acts
Rate of speech variable
Belligerent or combative
Mood changes frequent
Suicidal thoughts or impulses
Failures of affective response
Little concern over physical handicaps
Difficulty in making decisions
Distorts facts to defend opinions
Has hallucinatory experiences
Life history memory poor
Fears committing abhorred act
Words irrelevant to recognizable ideas
Anxiety affects task performance
Forgets earlier insights
Speech is stilted
Exaggerated affective responses
Reluctant to conform or cooperate


Factors
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basis for making inferences concerning the second-order factors than the correlations of the primary vectors with the second-order reference vectors.


The First-Order Factors 


Interpretation of each factor is made in terms of the variables correlated ±.30 or higher with the reference axis except where stated otherwise.

The first factor, A, as shown in Table 1, is nearly identical with the Paranoid Schizophrenic factor originally identified (9). Only the tendency to repudiate earlier insights or admissions (No. 50) is missing. Factor A is essentially the same as the Paranoid Projection factor isolated by Lorr, Jenkins, and O'Connor (3).


The original Wittenborn-Holzberg Schizophrenic Excitement factor is defined by 16 scale variables with factor loadings of .35 or higher. These variables, in order of size of factor loading, are as follows: reluctance to cooperate, slovenly and unkempt, little concern for others, combative, difficulty in sleeping, eating little, difficulty in carrying out plans and making decisions, variable rate of speech, irrelevant speech, poor memory, rapid change of ideas, restless, impudent, denies having a problem and is not orderly. As hypothesized, the oblique Factor B isolated here eliminates all of the elements of excitement. It is defined by the following in order of size of factor loading: words irrelevant to recognizable ideas, difficulty in making decisions, failure of affective response, difficulty carrying out plans, little concern for others, overt activity slowed or delayed, life history memory poor, avoids people, and slovenly and unkempt. Bizarre or obscure thinking and a reluctance to conform or cooperate also show minor correlations with B.

As hypothesized, B represents a disorganization of thinking. However, it also includes an element of social withdrawal. B resembles the factor of Conceptual Disorganization, which is marked by irrelevant and incoherent [...]


variables included in the two studies com- 
pared is only partial. Guertin’s (2) Confused 
Withdrawal also resembles Factor B. 
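
The interpretive rule used throughout this section (retain the variables whose correlation with a reference axis reaches a cutoff, then list them in order of size) can be sketched as follows. This is only an illustration: the loadings below are hypothetical, since the numerical entries of Table 1 are not reproduced in this text, and the function name is an assumption made for the example.

```python
def salient_variables(loadings, threshold=0.30):
    """Return (name, loading) pairs whose absolute loading reaches the
    threshold, ordered from largest to smallest absolute loading."""
    salient = [(name, value) for name, value in loadings.items()
               if abs(value) >= threshold]
    return sorted(salient, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical loadings on one factor (NOT values from Table 1).
example_loadings = {
    "Words irrelevant to recognizable ideas": 0.52,
    "Difficulty in making decisions": 0.47,
    "Avoids people": 0.33,
    "Talks loudly": 0.08,
    "Conspicuously optimistic": -0.12,
}

for name, value in salient_variables(example_loadings):
    print(f"{value:+.2f}  {name}")
```

Raising the threshold to .35, the cutoff quoted for the original Schizophrenic Excitement factor, simply shortens the list; the ordering rule stays the same.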

Wittenborn and Holzberg label their third 
factor Manic-Depressed and their sixth factor 
Paranoid Condition. As a result of the oblique 
rotations, these two factors undergo consider- 
able change, and two excitement parameters 
emerge. A comparison of columns C and F 
reveals that the two factors have in common 
the following variables: rapid change of ideas, 
restless, talks loudly, rate of speech variable, 
mood changes frequent, and exaggerated af- 
fective response.* 

Factor C is distinguished from F by the 
following: belligerent or combative, reluctant 
to conform or cooperate, difficulty in sleeping, 
impudent or impolite, expresses irritation, 
slovenly and unkempt, does not give in easily 
to others, little concern for others, and eats 
very little. Implied here is an Excitement with 
a Hostile Belligerence. Degan's Hyper-irritability and Lorr et al.'s Belligerence factors
closely resemble Factor C but do not include 
the aspects of excitement to be found here. 

Factor F is distinguished from C by expansive features such as the following: conspicuously optimistic, engrossed in plans, attention-demanding, speech stilted, and grandiose notions. The tendency to distort facts to defend opinions has a doubtful correlation of .26 with Factor F. Otherwise, there is no evidence of any paranoid element in this factor. The parameter underlying F appears to be an Excitement with Expansiveness. F is most similar to and perhaps identical with the factor of Manic Hyper-excitability, which is defined by excitement, destructiveness, and euphoria.

An examination of the variables showing negative correlations with F reveals a possible depressive pole. The pole is characterized as follows: avoids people, eats very little, overt activity slowed or delayed, and suicidal thoughts or impulses. This suggests that Factor F represents Wittenborn's Manic-Depressed factor with the hostile belligerent elements removed and the depressed pole reduced in importance.

The fourth factor, D, is essentially the


NA S] 
eat Yed 1 thought-feeling _ disharmony, fourth factor, 
lop On Suage, spee locking, slowea--~ hard L. Jenkins, M.D, in 
we Sth a p upa ind adidasa? sy. Raseariohi 4 o U “is gratefully ac- 

at 


e. se aa 
ly However, crbss'idettifleation is aile O Jida | 
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Table 2 


Correlation of Scales with Second Order Factors 


(Decimal points are omitted) 


Scale Description 


Difficulty in sleeping
Rapid change of ideas
Unjustified sexual beliefs
Obsessional thinking
Unrealistic self-blame
Gives in easily to others
Restless
Little concern for others
Use made of physical disease symptoms
Eats very little
Impudent or impolite
Expresses irritation
Avoids people
Talks loudly
Behavior affected by phobias
Slovenly and unkempt
Feelings of impending misfortune
Conspicuously optimistic
Difficulty carrying out plans
Doubts he can be helped
Feels conspired against
Feels others control his behavior
Acutely distressed by anxiety
Organic pathology with emotional basis
Concerned with orderliness
Attention-demanding
Overt activity slowed or delayed
Grandiose notions
Believes he has no psychological problem
Exhibits compulsive acts
Rate of speech variable
Belligerent or combative
Mood changes frequent
Suicidal thoughts or impulses
Failures of affective response
Little concern over physical handicaps
Difficulty in making decisions
Distorts facts to defend opinions
Has hallucinatory experiences
Life history memory poor
Fears committing abhorred act
Words irrelevant to recognizable ideas
Anxiety affects task performance
Forgets earlier insights
Speech is stilted
Exaggerated affective responses
Reluctant to conform or cooperate


Factors
No. X Y Z
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same as Wittenborn's Anxiety. It is characterized by the following: unrealistic self blame, feeling of impending doom, doubts he can be helped, suicidal thoughts or impulses, avoids people, difficulty in sleeping, overt activity slowed or delayed, and anxiety affects task performance. Factor D seems identical with Degan's Depression syndrome and Lorr, Jenkins, and O'Connor's Melancholy Agitation.


[...] study. Not only are the defining variables identical but the factor loadings are the same in the two analyses.


The Second-Order Factors 


The second-order factors are nearly orthogonal. The cosines of the angles between the reference vectors are .04, .00, and [...].

Factor X in Table 2 is characterized by: unrealistic self-blame, a feeling of hopelessness, obsessional thoughts, disrupting phobias, feelings of impending misfortune, acute anxiety, compulsive acts, and suicidal thoughts or impulses. Implied here is a pathological intropunitiveness or self-directed hostility common to Melancholy Agitation and the Phobic-Compulsive reaction.

Factor Y is best defined by irritability, loud talk, impudence, belligerence or combativeness, distortion of facts to defend opinions, attention-demanding behavior, and reluctance to cooperate. Other lesser elements are paranoid tendencies and an expansive optimism. Factor Y is most similar to Degan's second order factor, Mania, and represents a condition of excitement sometimes expressed in a hostile fashion and sometimes in a grandiose expansive fashion.

The variable correlations with Factor Z are [...] or less, which implies that Z may represent a residual plane. The irrelevant


To save printing costs, the transformation matrix [...] deposited with the American Documentation Institute. Order Document No. 5363, remitting [...]
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speech, poor memory for life history, difficulty 
in making decisions, little concern for others, 
failure of affective response, slovenly and un- 
kempt appearance, and slowed overt activity 
all suggest a personality disorganization. 
However, the significance of this factor seems 
doubtful. 
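
What "nearly orthogonal" means for the second-order reference vectors can be made concrete by converting the reported cosines to angles. The sketch below uses only the two cosine values that are legible in the text (.04 and .00); the function name is an assumption made for the example.

```python
import math


def angle_degrees(cosine):
    """Angle between two unit vectors, in degrees, from their cosine."""
    return math.degrees(math.acos(cosine))


# The two cosines legible in the text; the third value is not recoverable here.
reported_cosines = [0.04, 0.00]

for c in reported_cosines:
    print(f"cosine {c:.2f} -> {angle_degrees(c):.1f} degrees")
# cosine 0.04 -> 87.7 degrees
# cosine 0.00 -> 90.0 degrees
```

A cosine of .04 thus corresponds to a departure of a little over two degrees from a right angle.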
Discussion 

The factor structure that emerges from the 
oblique rotations is rather well defined. Why 
then, is there not greater agreement with the 
findings of other investigators? First, it may 
be that additional meaningful factors could 
have been extracted. Others have isolated as 
many as 9 and 11 factors (3, 4). Incomplete 
factor extraction would tend to obscure both 
the first- and second-order factors identified. 
Another restriction is the extent of overlap in 
the variables represented in the various stud- 
ies reported. A third limitation lies in the ab- 
sence of essential reference variables to define 
factors likely to be present (an inference 
based on studies published subsequent to 
Wittenborn’s). The Wittenborn Psychiatric 
Rating Scales do not explicitly include man- 
neristic movements or postures, stereotyped 
words or phrases, or separate scales for visual 
and auditory hallucinations. In the absence of 
these distinctive reference variables to define 
a factor of Perceptual Distortion or of Motor 
Disturbance, no common patterns can appear; they are lost in the unique variance. These seem to be some of the conditions restricting the findings from this study.

Summary

The orthogonally rotated Wittenborn-Holzberg data descriptive of 250 patients were re-rotated obliquely for the purpose of (a) clarifying the factor structure and (b) identifying any second-order factors that might be present. It was hypothesized that a factor of Schizophrenic Excitement would disappear and a factor of Thinking Disorganization would be identified. Of the factors isolated, Paranoid Projection, Melancholy Agitation, Conversion Hysteria, and Phobic-Compulsive reaction were found to be indistinguishable from those originally isolated. The factors of Schizophrenic Excitement, Manic-Depressed, and Paranoid Condition were replaced by an Excite-
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ment with Hostile Belligerence, an Excite- 
ment with Expansiveness, and a factor of 
Thinking Disorganization. The three second- 
order factors isolated were interpreted as 
Morbid Intropunitiveness, Manic Excitement, 
and Personality Disorganization.
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Since this writer and his colleagues had published a series of factor analytic studies of psychiatric symptoms (5, 6, 8, 9, 16), several [...] were received about the possible application of oblique methods of rotation to these studies. Accordingly, the set of oblique rotations that Lorr has published for our largest and most homogeneous sample is of particular interest (1). Since we never published oblique rotations for these studies, the writer [...] this opportunity to offer a few [...] comments concerning orthogonal and oblique rotations and to indicate why oblique rotations were not considered desirable for our analyses of the symptom rating [...].

The published series of factor analyses were instrumental to the preparation of a quantitative procedure for the multiple descriptive diagnosis of patients suffering from severe psychological disorders (2, 20). We were interested in developing such a procedure to provide criteria scores whereby we might examine descriptive implications of certain psychological tests and whereby we might also evaluate symptom changes as a result of various treatment procedures (2, 5, 10, 11, 12, [...]). Having no suitable theory of the psychological intervening variables which would provide a basis for the desired quantifications, we were obliged to rely upon descriptive concepts as a basis for our quantifications. We hoped that a limited number of descriptive concepts could be inferred in such a way that they would have the following characteristics:


1. They would describe the symptoms that psychiatrists consider important in evaluating their patients.

2. They would be as few as possible in number and provide a descriptively adequate summary for the set of symptoms.


Rotational Procedures and Descriptive Inferences


3. They would be plausible in nature and, if pos- 
sible, refer to existing descriptive concepts. 

4. Their relationships with each other would be 
comprehensible (all other things equal, orthogonally 
related concepts would be more comprehensible than 
intercorrelated or obliquely related concepts). 
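
The claim in item 4 that orthogonally related concepts are easier to grasp has a simple arithmetic counterpart. A minimal sketch, assuming hypothetical pattern loadings and a hypothetical factor correlation (none of these numbers come from the studies discussed): with uncorrelated factors the common variance of a scale is the sum of its squared loadings, whereas with correlated factors the cross-product terms, weighted by the factor correlations, must be added as well.

```python
def common_variance(loadings, factor_correlations):
    """Common variance of one scale: sum over i, j of l_i * l_j * phi_ij.

    `loadings` are the scale's loadings on each factor and
    `factor_correlations` is the matrix of correlations among the factors
    (the identity matrix when the factors are orthogonal).
    """
    k = len(loadings)
    return sum(
        loadings[i] * loadings[j] * factor_correlations[i][j]
        for i in range(k)
        for j in range(k)
    )


loadings = [0.6, 0.5]                 # hypothetical loadings on two factors
orthogonal = [[1.0, 0.0], [0.0, 1.0]]
oblique = [[1.0, 0.3], [0.3, 1.0]]    # hypothetical factor correlation of .30

print(round(common_variance(loadings, orthogonal), 2))  # 0.61
print(round(common_variance(loadings, oblique), 2))     # 0.79
```

In the orthogonal case each factor's contribution can be read off separately (.36 and .25); in the oblique case an additional .18 is shared between the two factors and cannot be credited to either alone.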


In brief, we found that it was possible to 
build a practicable set of 55 symptom rating 
scales (20). The intercorrelations among the 
symptom rating scales could be adequately 
expressed in terms of nine independent factors 
(the residual correlations were not reliable or 
consistently interrelated). These nine inde- 
pendent factors could be orthogonally rotated 
to reveal symptom clusters which were remi- 
niscent of classical diagnostic concepts, in- 
cluding the manic-depressive bipolarity (4, 
6). Thus the inferred descriptive concepts not only had the advantage of being familiar (such plausibility should not be considered any more likely for orthogonal than interdependent rotations, however), but they were independent of each other.

In view of our purposes, it is easy to see why we were satisfied with our orthogonal rotations and chose to emphasize them in our publications. The reader may be interested, however, in the reasons which specifically led us to avoid the use of oblique rotations in developing our procedure:
1. Trying to describe patients in terms of several independent abstract concepts is difficult enough; trying to describe them in terms of complexly interdependent concepts is likely to be discouraging to most practicing [...] who have little interest in quantitative refinements.

2. We had applied an oblique procedure to one of our studies and found the results less [...] than the results of the orthogonal procedure; that is, the factors yielded by the oblique procedure
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were not as familiar [...] as the orthogonal factors, and the manic-depressive bipolarity was obscured.

3. We desired to infer constructs which would have descriptive merit for several different samples (and analyzed several different
samples with this purpose in mind [4, 6, 8, 
9, 16]). In view of this broad interest, it was 

feared that descriptive inferences that re- 
flected the interrelationships among clusters 
in a particular sample could have less general 
descriptive merit than inferences based on a 
rotational procedure which showed the gen- 
eral organization of symptoms and did not 
overdescribe one sample at the cost of under- 
describing another. 

4. We were not interested in the interrelationships among exactly defined clusters. Although a study of such interrelationships may be the proper subject for [...], we considered it more practical [...] the interrelationships of [...] by our completed descriptive [...] than to be interested in [...].

5. Various samples were chosen to reveal the general nature of the inferences which could be most descriptively valuable. The samples were not selected to provide a valid basis for confirming the kinds of interrelationships which might be found among symptom clusters.

Lorr comments that the results of factor analyses will vary from study to study as a result of differences in the intercorrelated variables (1). It should be added that, if the samples of subjects which provide the data vary from each other with respect to the variety and organization of their behavior, the results of factor analyses will vary from study to study despite the uniformity of the intercorrelated variables. Indication of this is provided in our studies which permit a comparison of old (8) and young patients (8), newly admitted fulminating patients (6), and chronic patients (4). All of these groups provided a slightly different basis for descriptive inference despite the fact that the same set of rating scales was used.
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With respect to Lorr’s interest in E E 
difference between his oblique rotation of the Connecticut factors and other oblique rotations of other symptom data, the writer would suggest that the differences might be due either to the differences in the group of symptoms intercorrelated or differences in the varieties of symptoms which characterize the sample. In this connection, it should be noted that the Connecticut sample comprised [...]. They were newly admitted patients [...] for the most part in the "fulminating" [...] their disorder. We have found the concept of schizophrenic excitement [...] in describing such patients (6, 19). [...] hebephrenic (or deteriorated) schizophrenic is [...] useful in describing patients who have been hospitalized for a long time (4).

Our procedure was not intended to be exhaustively descriptive of patients' disorders. The sample of symptom rating scales emerged [...] conferences with numerous psychiatrists and was considered by them to comprise the most important symptoms for describing patients who have severe functional disorders but who are not yet far advanced in their [...]. The descriptive implications of the [...] been extensively and intensively [...], however (7, 13, 14, 15, 17, 19).

This comparison serves to remind us that the value of inferences can be better judged on the basis of the purposes they serve than on the basis of the manner in which they are derived. This comment is particularly pertinent when we compare rotational procedures which, regardless of their outcome, are equally adequate for the description of the original data.


Reply. 
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Orthogonal Versus Oblique Rotations 


Maurice Lorr 
Veterans Administration, Washington, D. C. 


In psychology, as in any scientific field, the observed data can be interpreted in a number [...]. Other investiga- [...] for further [...], consistency with already developed theory, simplicity, geometric fit to the data, and plausibility.

[...] orthogonal or an [...] of orthogonality [...] scores are to be [...]. However, if the scores used are simply a weighted combination of ratings on the scales [...] thus hypothetical since the variables are rarely independent statistically. In fact, the gain in understandability of the orthogonal factor would also seem to be doubtful. Most investi- [...] are typi- [...] a battery. [...] greater [...] since it seems likely that psychologi- [...] rarely independent. [...] simple structure is best obtained by oblique rotation [...] assumption of orthogonality and a considerably better fit of the data to the model is [...] secured. An examination of the two-dimensional plots of the Wittenborn orthogonally rotated factors supports this [...]. The Schizophrenic Excitement reference [...] is poorly defined in most of the plots. In this particular instance, the orthogonal framework does not provide a satisfactory geometric fit to the data.

Since fewer assumptions are made concerning the data and the simple structure achieved is better, it is plausible to [...] on logical grounds that an oblique structure would be more stable from sample to sample than an orthogonal structure. Unfortunately there are little data to check this conjecture.

[...] second-order [...] relations obtaining between the more elementary factors. The second-order factors provide tentative definitions of more inclusive response [...]. [...] intrapunitive parameter has been repeatedly identified (2, 3). Others are factors of [...] withdrawal [...] excitement. If the construct validity of these [...] is further established their intrinsic [...] greater than that of the primary factors.

Wittenborn's aim was to [...] conforming as far as possible to [...] psychiatric syndromes which were, at the same time, as few in number as possible. In our work we have explicitly hypothesized certain constructs that grew out of clinical experience and theory. The data were then used to test whether or not these constructs were sup-
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Orthogonal Versus Oblique Rotations 


ported in fact. We viewed the psychiatric diagnostic groupings as classes of patients, not as syndromes. We were interested instead in the parameters underlying symptoms and deviant behavior. Establishment of distinguishable patient groups we regarded as a separate problem.

In summary, we have given greater weight to that mathematical model which required the fewest assumptions and which yielded the best simple structure fit to the data. Our aim has been to identify parameters of psychopathology of some generality regardless of their possible use in clinical practice. It is this dif-
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ference in goals and differences in models that 
the reader must consider in making his choice. 


Rejoinder. 
Received June 26, 1957. 
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The Social Desirability Stereotype and Some 
Measures of Psychopathology’ 


C. James Klett and Arthur S. Tamkin 


VA Hospital, Northampton, Massachusetts 


[...] in judg- [...] socially desirable [...] found between [...] (r = [...]), [...]. In spite of this [...] were some sys- [...] items. No essential differences were found [...] into a psychotic and a [...], however (r = .90).

In an attempt to find some means of separating this patient group into sub[...] more heterogeneity in the social desirability stereotype, 84 of the 118 patients [...] scorable MMPIs were sorted into three [...] Barron Ego Strength Scale (Es), the Pathology Scale (P), and the Social Desirability Scale (SD) from the MMPI as criteria. These dichotomies were then evaluated for heterogeneity of the stereotype by correlation and [...].

Although considerable homogeneity of the stereotype was present in all [...], the dichotomies formed by P and by [...] significantly less homogeneity than the other dichotomies (r = .68 and .71 respectively). In the sign test analysis it was [...] that patients high on P and/or low on SD [...] need Endurance items to be significantly less socially desirable and need Heterosexuality items to be significantly more socially desirable than the corresponding low [...] high SD groups. The high Es group judged need Achievement and need [...] to be significantly more [...] to be significantly less socially desirable [...] the low Es group. It was felt that the increased heterogeneity in the social desirability stereotype and the particular subscale differences found were accounted for most parsimoniously by assuming that the MMPI scales had separated the subjects on the basis of their willingness to conform to [...] judgments about what is socially desirable. This may or may not be related to degree of psychopathology.

Brief Report.
Received July 17, 1957.


References

1. Edwards, A. L. Edwards Personal Preference Schedule. Manual. New York: [...].
2. [...] stability of the social desirability scale values in the Edwards Personal Preference Schedule. J. consult. Psychol., [...] 183-185.
3. Klett, C. J. The social desirability stereotype in a hospital population. J. consult. Psychol., in press.


1 An extended report of this study may be obtained without charge from Dr. C. James Klett, Veterans Administration Hospital, Northampton, Mass., or for a fee from the American Documentation Institute. Order Document No. 5362, remitting $1.75 for microfilm or $2.50 for photocopies.




Journal of Consulting Psychology
Vol. 21


A Factor-Analytically Based Rationale for the Wechsler Adult Intelligence Scale 1


Jacob Cohen
Franklin D. Roosevelt Veterans Administration Hospital and New York University


In addition to its utility in the charting of new [...], factor analysis is an important [...] analysis, one which can provide the clinician with insight into what his tests are measuring. A previous factor-analytic study of the Wechsler-Bellevue [...] schizophrenic and brain-damaged patients (2), [...] rationale for the Wechsler-Bellevue as applied to neuropsychiatric patients (1). In addition to a description of the measurement function of each subtest, two major conclusions were drawn: (a) although the same factors emerged from the three groups, some of the subtests had different measurement functions in the different groups, and (b) [...] in the literature which [...] order of specificity of measurement to the subtests are unjustified in the light of their relatively high communalities and relatively low reliabilities (1, p. 277; 2, [...]).

With the publication of the Wechsler Adult Intelligence Scale (WAIS) (4, 8) which is a revision and restandardization of Form I of the Wechsler-Bellevue, it became possible to perform a comparative factor analysis of normals over a wide age-range. This was done, and has been separately reported (3). The age groups studied were ages 18-19 (N = 200), 25-34, 45-54 (N = 300), and 60-over 75. It was found that the same major factors were operative over this age range, and, moreover, that these were essentially the same factors as had been identified for neuropsychiatric populations on the Wechsler-Bellevue,2 as follows:

Factor A, Verbal Comprehension: Vocabulary richness and verbal-symbolic manipulative ability.

Factor B, Perceptual Organization: The organization of nonverbal, visually perceived material against a time limit.

Factor C, Memory: This involves a renaming of what was previously called Freedom from Distractibility (1, 2), not a different factor (3). It involves both immediate memory as well as the efficiency with which previously learned material can be called up when needed.

Two minor factors, both of which loaded only one subtest consistently, were found. These have no analogues in the previous study (1, 2), but their consistent appearance in this study argues strongly for their "reality." Factor D is a Picture Completion specific found in all four groups and Factor E is a Digit Symbol specific found in all but the oldest group.

Finally, a second-order factor was found [...] the case in (2), was interpreted [...] functioning and [...] subtests.


1 [...] Service, Franklin D. Roosevelt [...] Hospital, Montrose [...]. The author [...] acknowledges the [...] Leon L. Rackow, Manager [...] of Professional Services, [...] of the Dean's Committee [...] Klebanoff, Chief, Psychology [...]

2 The names given the factors differ, but their composition is essentially the same. Such differences as [...] are interpreted as being the consequence [...] in the factor-analytic method made [...] larger samples and cross checking among [...]

The factor analysis described in (3) was a standard one: centroid extraction and oblique rotation to simple structure in each of the four groups. For the purpose of providing a rationale for the subtests, a supplementary analysis was performed. This analysis, after
Thomson (6, pp. 189-191), provides a factor 
table in which the correlation of each sub- 
test is given with G and also with the com- 
mon factor-specifics, i.e., what is specific to 
each factor after G has been removed. Since 
G and the factor-specifics are mutually inde- 
pendent in this analysis, the square of each 
subtest-factor correlation gives the proportion 
of the variance of the subtest attributable to 
the factor in question. 
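
Because G and the factor-specifics are independent, the arithmetic described here is a plain squaring. The sketch below applies it to the two correlations of Information with G that are quoted later in the article (.83 for the younger groups, .73 for the aged group); the function name is an assumption made for the example.

```python
def variance_share(correlation):
    """Proportion of a subtest's variance attributable to a factor,
    valid when the factors are mutually independent."""
    return correlation ** 2


for label, r in [("younger groups", 0.83), ("aged group", 0.73)]:
    share = variance_share(r)
    print(f"Information with G, {label}: r = {r:.2f}, share = {share:.0%}")
# Information with G, younger groups: r = 0.83, share = 69%
# Information with G, aged group: r = 0.73, share = 53%
```

The second figure agrees with the 53% cited in the discussion of the Information subtest.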
Such an analysis was performed for each of the four age groups of the standardization population. It was found in (3), however, [...]

Table 1


Mean Correlations Between WAIS Subtests and Factors
of the 18-19, 25-34 and 45-54 Age Groups a


Subtest             G     A     B     C     D     E

Information        83    32     …    08    05    00
Comprehension      72    38     …    02    06   -09
Arithmetic         71    06    05    32    04   -05
Similarities        …    27   -05   -02    15     …
Digit Span         62   -08   -03    29    02    15
Vocabulary         83    39   -02    03     …    07
Digit Symbol       65    05    12    05   -03    28
P. Completion      75    05     …   -02    32     …
Block Design       70     …    41    07    02     …
P. Arrangement     70    07    29     …    08     …
O. Assembly         …     …     …     …    01    01
Percentage of
  Variance       52.3%  4.6%  4.7%  2.1%  1.8%  1.5%




Table 2


Correlations Between WAIS Subtests and
Factors of the 60-Over 75 Age Group a


Subtest             G     A     B     C     D

Information        73    29   -08    40    07
Comprehension      60    39    01     …   -07
Arithmetic         63    04    05     …    05
Similarities       65    42    09     …    01
Digit Span         48    00    08    44    02
Vocabulary         66    37   -02    50    06
Digit Symbol       71   -06    26    21    13
P. Completion      75    07    04     …    25
Block Design       65    00    53     …    00
P. Arrangement     65    30    18     …    07
O. Assembly        58   -01    51   -03    02
Percentage of
  Variance       42.1%  5.9%  6.1%  9.7%  0.9%


a Decimal points omitted. Values of .20 and greater italicized.


[...] the transformation of r (5, pp. 132-133) for the
three younger groups (18-19, 25-34, 45-54),
and Table 2 gives similar data for the
aged group (60-over 75).

The entries in the tables represent the
correlation of the subtest with G and with each
independent factor-specific. The square of each
entry is the proportion of that subtest's
variance attributable to the factor in question. At
the foot of each column is given the percentage
of the total variance of the subtests
attributable to the factor.
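
The averaging of correlations across the three younger groups can be sketched as follows. This is a hedged illustration only: it assumes that the "transformation of r" cited above is Fisher's z transformation, which is the usual device for averaging correlations, and both the function name and the three sample values are hypothetical rather than taken from the study.

```python
import math


def mean_correlation(rs):
    """Average correlations by way of Fisher's z (assumed method).

    Each r is converted to z = atanh(r), the z values are averaged,
    and the mean z is converted back to a correlation with tanh.
    """
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))


# Hypothetical correlations of one subtest with G in three age groups.
group_rs = [0.80, 0.83, 0.85]
print(round(mean_correlation(group_rs), 2))
```

Averaging in z units rather than averaging the raw correlations keeps the result from being biased toward zero when the correlations are large.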


Information 


For the younger groups, Information shares
with Vocabulary the distinction of being the
best measure of G among the subtests,
with a correlation of .83 (see Table 1). It
also consistently measures Verbal Comprehension,
but does so to a somewhat lesser
degree than do Vocabulary and Comprehension.


In the aged group, Information continues
to be a relatively good measure of G but,
consonant with the general decline of G variance
for these subjects (Ss), its correlation
with G falls to .73 (see Table 2). When this
difference is expressed in terms of the percentage
of its variance attributable to G, its
magnitude is appreciable, 69% to 53%.

This subtest also illustrates the importance
of the Memory factor in the age group 60
and over.


Factor-Analytic Rationale for the WAIS 


Although Information continues to
correlate with the Verbal Comprehension factor,
it is more highly correlated with the
Memory factor for these aged subjects, in
terms of variance the comparison being Factor A
8% and Factor C 16%. The factorial complexity
of this subtest for the aged results in
ambiguity of interpretation. A given subtest
score may be the result of many combinations
of levels of verbal and memory ability. Thus,
a poor Information score for an aged S may
result either from poor life-long verbal ability
or from age decline in memory ability.

Whatever else Information measures validly
beyond G and Verbal Comprehension (and
Memory for the aged), it can do so with only
about 10% of its variance (3, Table 7).
Its specificity being so low, interpretations of
Information along lines other than those above
are necessarily hazardous.

In summary, Information among the WAIS
subtests provides the all-around best measure
of G over the entire age range, fully justifying
Wechsler's faith in it (7, pp. 77-78). It is not,
however, a good choice to use as a measure
of Verbal Comprehension, both because other
tests surpass it in this regard and because of
its complexity in aged Ss.




Comprehension 
Comprehension holds median rank as a measure
of G with younger Ss (.72), but shares first
rank with Vocabulary in providing the best
measure of the Verbal Comprehension
factor (see Table 1) and measures no other
common factor.

In the oldest group, its loading of .60 on G
is poor both absolutely and relatively, ranking
near the lowest among the subtests (see
Table 2). With regard to common factor measurement,
it suffers the same fate of ambiguity
as Information: it responds to individual
differences in Memory ability as well as in
verbal ability. In the case of Comprehension,
the variance attributable to the two factors
is [...], Factor A 15% and Factor C [...].

The specificity of Comprehension averages
the same value as for Information, 10%.
Although it was found that [...]
on the Wechsler-Bellevue may [...]
variance tied to “judgment” for
schizophrenics (1, p. 273), its revised form
in the WAIS has such low specificity for normals
that singling it out of a record for
unique interpretation is unjustified.

The major utility of Comprehension, then, 
is as a measure of the Verbal Comprehension 
factor from early adulthood through middle 
age. It is not a good measure of G over the 
entire age range, and is an ambiguous meas- 
ure of A and C for old Ss. 


Arithmetic 


The picture which emerges in the analysis 
of this subtest’s measurement characteristics 
for Ss up to middle age is that it is a mediocre 
measure of G, and beyond this a “pure,” al- 
though weak, measure of the Memory factor 
(see Table 1). Since G accounts for an aver- 
age of 50% of its variance, and the Memory 
factor an average of 10%, much of the variance
is yet to be identified. From the reliability
coefficients given in the manual (8,
p. 13), it is estimated that about 19% of
the subtest's variance is specificity, and this
amount is among the largest obtained among
the subtests. With specificity variance about
twice as great as Memory factor variance,
judgment about Memory ability from this
subtest is risky.
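
The specificity estimate described above can be illustrated with a short sketch. It assumes the standard decomposition in which specificity is the reliable variance left over after common factor variance is removed. The reliability figure of .79 used below is not quoted in the text; it is a hypothetical value chosen only so that the arithmetic reproduces the 19% cited for Arithmetic, and the function name is likewise an assumption.

```python
def specificity(reliability, common_shares):
    """Reliable variance not accounted for by the common factors.

    reliability   -- reliability coefficient of the subtest
    common_shares -- proportions of variance due to each common factor
    """
    return reliability - sum(common_shares)


# Arithmetic in the younger groups: G about 50%, Memory about 10%.
# The reliability of 0.79 is an illustrative assumption (see lead-in).
arithmetic_spec = specificity(0.79, [0.50, 0.10])
print(f"{arithmetic_spec:.0%}")
```

Whatever variance is not covered by the reliability coefficient is treated as error, so a higher assumed reliability would raise the specificity estimate by the same amount.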
For the aged Ss the picture changes [...].
G measurement is reduced (r = .63) [...]
Factor C correlation goes up [...].
Unfortunately the reliability coefficients
for the subtests are not [...] for this group,
so that determination of the specificity of
Arithmetic is not possible. Nevertheless,
since Memory variance is [...], one can more
safely attribute [...] this factor in the aged
group, after [...] G (compared to other
su[...] Memory ability, [...]
three groups and strongly for the [...]
Similarities 


The Similarities subtest in younger Ss ranks
[...] among the subtests in its correlation
with G [...]. Although it correlates [...]
with Factor A, its correlation is low both
absolutely (.27) and relatively (see Table 1).
Its average specificity is 17% (3, Table 7),
which is higher than that of
any of the other subtests correlating with
Factor A. The combination of relatively high
specificity and low Verbal Comprehension
variance makes it an untrustworthy measure
of the latter when used by itself.
On the other hand, for aged patients, it has 
measurement characteristics that recommend 
it to the clinician. It is the only verbal subtest 
which correlates with Factor A (.42) and does 
not also correlate with the Memory factor (see 
Table 2). After allowing for its G measure- 
ment (.65), it can therefore be used as a 
measure of Verbal Comprehension, uncon- 
taminated by variance in the Memory factor. 
(Its correlation with G is median among the 
subtests.) 

In summary, while Similarities has nothing
particular to commend it up through middle
age, among aged Ss it has the advantage of
being a measure of Verbal Comprehension
unaffected by individual differences in Memory
ability.

Digit Span


In the younger groups, Digit Span is
the poorest measure of G (.62), and a “pure”
[...]. Specificity is quite [...]
average of 20[...].

[...] be the poorest [...]
group, with an r of
.48 (see Table 2). It measures the Memory
factor purely here, too, and to almost the
same degree as it does G (.44). This test can
[...]
[...] specific factor Verbal Comprehension can [...].
Thus, if a single subtest is needed to represent
general intellectual ability [...]
verbal avenue, Vocabulary would be the
test of choice for younger Ss.

Among the aged, the picture changes materially.
Firstly, Vocabulary becomes a
poorer measure of G (.66), falling at the
median of the subtests in this regard (see
Table 2).

A second difference in Vocabulary's measurement
functions with the aged is the now
familiar incursion of Memory variance:
25% of its variance (an r of .50) is associated
with Factor C, far more than the 14% (an r
of .37) attributable to Verbal Comprehension
(see Table 2). This is the same problem already
encountered with Information and Comprehension,
but raised to an even greater degree.
Poor scores on this test are more
likely to reflect poor memory ability (consequent
upon senescent deterioration) than poor
verbal ability per se. From the score on Vocabulary
alone, however, the clinician cannot
make an unambiguous interpretation.

Vocabulary emerges as a good measure of
G and Verbal Comprehension until old age,
when it becomes a mediocre measure of G
and an ambiguous measure of both the
Memory and Verbal Comprehension factors.


Digit Symbol


With the Digit Symbol test we have our
first encounter with one of the minor specific
factors, Factor E. For younger Ss, it measures
only this among the common factors, and its
correlation with G (.65) is among the poorest
(see Table 1). Since only this test measures
Factor E consistently, a positive interpretation
of this factor is not possible; all that can
be said is that it is not a measure of Perceptual
Organization, or of memory (or, of
course, of verbal ability).

The specificity of Digit Symbol cannot be
[...] with certainty, but is probably very
high. The only reliability information available
is a range-corrected [...] form coefficient
for a group of female [...] school
applicants (8, pp. 12-13). If the obtained
value of .92 is used as an estimate of the reliability
coefficient for our younger groups, the
estimated average specificity which results is
37%. Specificity accounts for almost as
much of this test's variance as G (42%),
and much more than E (8%).

In the aged group, as has been noted, Factor
E failed to appear. Beyond G measurement,
which is relatively high (.71), the Digit
Symbol test is significantly but weakly related
to Perceptual Organization (.26) and the
pervasive Memory factor (.21). Using
the same estimated reliability coefficient
as for the younger group, specificity accounts
for some 28% of the variance, a large proportion.

Because of its estimated large specificity,
[...] uninterpretable minor [...]
in the younger groups, and its complexity
in the aged, scores from this test [...]
interpretation as to common factor significance.
Although Digit Symbol is a fairly
good measure of G for aged Ss, there are better
tests for this purpose in the battery
(Information and Picture Completion, see
Table 2).


Picture Completion 


The average correlation with G for Picture
Completion in the younger groups is .75, the
highest value among the performance
tests in [...] groups (see Table 1).
It measures Factor D, the other specific minor
factor, with 10% of its variance, and no other
[...]. An additional 15% of its variance is
[...] in specificity. Since Factor D is
uninterpretable, [...] has no utility for common
factor measurement, but in situations where
a nonverbal measure of G from a single subtest
is [...], the evidence of the present
research [...] the best choice.

In the 60-over 75 group, this subtest has
[...] G correlation (.75), but here it is the
best single test for measuring G in the entire
[...] (see Table 2). As is true in the
[...], it measures Factor D purely,
[...]


In previous factor-analytic research with the
Wechsler-Bellevue (2), it was reported that
Picture Completion was a complex test, measuring
both Factors A and B for patients (1,
pp. 275-276). This seems now to be an error,
a consequence of failure to extract enough
factors, due in turn to a low N. In extracting
two additional factors in (3), the ambiguity
due to the test’s apparent complexity in [...]
factors was removed, unfortunately only to be
replaced by ambiguity resulting from a loading
in an unknown factor. A parallel situation
exists for the Digit Symbol test (1, pp. 276-277).


Block Design 


The measurement characteristics of Block 
Design are quite straight-forward and similar 
for younger and aged patients. It falls at 
about the median with regard to strength of 
correlation with G, with values of .70 for the 
younger three groups and .65 for the aged 
group (see Tables 1 and 2). It is the first 
subtest thus far encountered whose common 
factor variance is completely in the Perceptual
Organization factor. The correlations with
this factor, moreover, are relatively substantial:
.41 is the mean value for the younger
groups (Table 1) and .53 for the aged group
(Table 2).

Its specificity averages 17%, which
is above the median of the subtests, but not
objectionably large (3, Table 7).
The Block Design subtest, then, is useful 
for the purpose of measuring Perceptual Or- 
ganization over the entire age range. 

Picture Arrangement 
For the younger Ss, this subtest is at about
the median in its correlation with G (.70).
Beyond its G measurement, although it
correlates [...] on Perceptual Organization, it
does so weakly, with an average value of
[...]. This amounts to only [...]
negligible amount. [...] only 8% (3,
[...] variance is taken [...]
its reliabili[...] being .66, .60, and [...]
worse when one [...] (Table 2). Its G
[...] still median, but poorer in absolute
[...]. With regard to common
factor measurement, the correlation with
Factor B is even lower (.18), and it correlates
.30 with the Verbal Comprehension factor, at
about the same level as Information does.
The factorial complexity of Picture Arrangement
in these older Ss raises the concomitant
problem of ambiguity in the interpretation of
its scores.

To sum up the case of Picture Arrangement, 
it is a mediocre measure of G taken by itself, 
and a very weak or ambiguous measure of the 
common factors. 


Object Assembly 


The final subtest of the WAIS battery, Object
Assembly, correlates with G [...] in the
younger groups and .58 in the aged group, in
each second only to Digit Span in being the
poorest subtest in this regard (Tables 1 and
2). As was the case with Block Design,
[...] in the younger three [...]
oldest group (Table 2).

Its specific variance is [...]
subtests, averaging 5% [...]
(computed from relia[...], p. 13).

Despite this, this test can be used effectively
to measure Perceptual Organization,
particularly when used as described below.

Factor Scores 
The conventional scoring of Wechsler scales
[...] by appropriate [...]
of Verbal, Performance, and Full Scale IQs.
The [...]

Jacob Cohen 


sponds to Perceptual Organization but also to 
the minor specific factors D and E. 

It is possible, on the basis of the above
analysis, to replace the a priori verbal-performance
subtest groupings by the functional
unities actually in operation. It is suggested
that scores for the three major factors
described below may be found useful in clinical,
educational, and vocational research and,
subsequently, appraisal. Their estimation by
averaging appropriate groups of subtests
rather than from single tests results in
considerably increased reliability.

In the material which follows, it is necessary
to convert raw scores to weighted scores
for the purpose of achieving subtest comparability.
However, the weighted scores required
are not those normally found for the purpose
of IQ computation, where the reference group
was made up of standardization groups between
the ages of 20 and 34. Instead, the
weighted score conversion tables determined
separately for each age group as given in an
appendix in the manual are required (8, pp.
99-110). Since the findings of the study are
based on correlations of groups homogeneous
with regard to age, it is necessary that
age-corrected weighted scores be utilized.

Since all the subtests are correlated at least
appreciably with G, the G factor score is
found by simply averaging the weighted scores
of all the subtests (or all which have been
given). A score of 10 on this (and all other
factor scores) indicates average functioning at
that age level. If only a measure of G is
needed, the Full Scale IQ, of course, serves
this purpose. The mean subtest weighted
score, however, provides a reference point
comparable with the other factor scores.

Verbal Comprehension factor scores, for all
but the oldest patients, are gotten by averaging
Information, Comprehension, Similarities,
and Vocabulary weighted scores. Verbal Comprehension
ability thus measured is “high” or
“low” relative to the population as it is above
or below 10, and “high” or “low” relatively
(i.e., within the pattern of the S’s abilities)
as it departs from the S’s G factor score. These
relationships hold for all the factor scores.
The approach given in the preceding paragraph
cannot be used in aged patients, since,
as has been noted, the resulting score would
reflect as much Memory as Verbal
Comprehension variance. Instead, Verbal
Comprehension factor scores for patients past
60 are found by averaging the weighted scores
for Similarities and Picture Arrangement,
these being the only tests loading this factor
which load no other (besides G).

Perceptual Organization factor scores are
obtained by averaging the Block Design and
Object Assembly weighted scores for both
younger [...].

The Memory factor is also measured in the
same way over the entire age range, namely,
by averaging the weighted scores for Arithmetic
and Digit Span.
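
The factor score rules described in this section can be summarized in a short sketch. It is an illustration under stated assumptions, not the author's scoring procedure: the function name, the dictionary keys, and the `aged` flag are hypothetical, and the input is assumed to be a complete set of age-corrected weighted scores for all eleven subtests.

```python
from statistics import mean

# Subtest groupings as described in the text (key names are hypothetical).
YOUNGER_GROUPS = {
    "verbal_comprehension": ["information", "comprehension",
                             "similarities", "vocabulary"],
    "perceptual_organization": ["block_design", "object_assembly"],
    "memory": ["arithmetic", "digit_span"],
}
# For aged Ss only Verbal Comprehension is estimated differently.
AGED_GROUPS = dict(
    YOUNGER_GROUPS,
    verbal_comprehension=["similarities", "picture_arrangement"],
)


def factor_scores(weighted, aged=False):
    """Average age-corrected weighted scores into factor scores.

    weighted -- dict mapping subtest name to its weighted score
    aged     -- True to use the grouping suggested for the oldest group
    """
    groups = AGED_GROUPS if aged else YOUNGER_GROUPS
    scores = {"g": mean(list(weighted.values()))}
    for factor, subtests in groups.items():
        scores[factor] = mean(weighted[name] for name in subtests)
    return scores


# Hypothetical record for an aged S (values invented for illustration).
record = {
    "information": 8, "comprehension": 8, "arithmetic": 6,
    "similarities": 12, "digit_span": 6, "vocabulary": 8,
    "digit_symbol": 9, "picture_completion": 10, "block_design": 11,
    "picture_arrangement": 10, "object_assembly": 9,
}
print(factor_scores(record, aged=True))
```

A factor score of 10 marks average functioning for the age group, so each result can be read both against 10 and against the S's own G score.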
A possible approach to the measurement of
deterioration is suggested by the incursion
of the Memory factor into the verbal sphere.
If an older S's Memory factor score is much
lower than his Verbal Comprehension factor
score, and his average score on the tests which
share variance from both these factors, Information,
Comprehension, and Vocabulary, is
close to his reduced Memory factor score, this
suggests that the verbal subtest scores have
been reduced by failing memory ability, and
[...]

Summary

[...] the subtests of the WAIS [...]
analyses of this test for [...]
age range (3), has been presented.
Each subtest's measurement function, in terms
of a dominant general factor, three
common factors (Verbal Comprehension, Perceptual
Organization, Memory), and two minor
factors was presented. The results in the three
younger groups (18-19, 25-34, 45-54) were
found to be quite similar, but some subtests
undergo a change in measurement function in
the oldest group (60-over 75). Specificities
are again (as in 1) not found high enough to
warrant unique interpretations of the subtests.
Methods were presented for the determination
of factor scores which may prove useful in the
appraisal of intelligence.


Intellectual Ability and Mode of Perception 1


Douglas N. Jackson 


Pennsylvania State University 


Witkin and his colleagues (5) have re- 
ported intercorrelations among several labora- 
tory measures of orientation to the upright, 
and have shown that the latter correlate 
rather highly with a specially constructed 
embedded-figures test (4), in which S is instructed
to locate a simple figure contained
in a more complex, colored pattern. The authors
also present relationships between [...]
orientation tasks and [...]
test and scores derived from a variety of clinical
assessment devices.

Because the [...]
visual field-ind[...] described as
more active, self-aware, self-assured, and generally
“mature”), it was deemed advisable to
[...] intelligence in a study
of relationships among perceptual and personality
variables (2).

The Witkin embedded-figures test was
administered to [...] Examination
(ACE) scores were also available. The
product-moment correlation between the two
[...]


1 An extended report of this study may be
obtained without charge from [...]


[...] correlations may have been attenuated by
the restricted range of intelligence in the
[...] population.

This relationship between intelligence and
perception in young adults parallels preliminary
data obtained with children in [...]
Laboratory (5, p. 478). The [...] results
are not surprising, considering Thurstone's
(3) findings, and should add a note of caution
to attempts to interpret relationships between
individual differences in perception and
personality without adequate control for
intellectual differences.

However, in a broader sense, the historical
priority of the concept of intelligence need not
imply its conceptual preeminence; it may be
fruitful to consider intellectual abilities and
perceptual mode as manifestations of more
general dimensions of personality style.
Clearly, there is a need for further study
of the various intellectual abilities (1) in
relation to mode of perception.

Brief Report,


An Evaluation of Eclectically Oriented Psychotherapy


Frederick C. Thorne 


Brandon, Vermont 


The purpose of this study was to evaluate
the results of eclectically oriented psychotherapy
utilizing a wide variety of methods (3)
specifically according to indications of time
and place. The most rigorous test of the efficacy
of any therapy can be made only by
using cases of established severity which have
proved refractory to other treatment methods,
and in this study the most chronic and malignant
cases available were deliberately selected.
Much previous research has been
weakened by failure to establish the exact
nature and severity of the pathological processes
presumed to be under treatment. This
aspect was controlled in the present study by
use of severely maladjusted subjects independently
studied and diagnosed before referral
for therapy, so that the nature and degree
of pathogenicity was objectively established
by other specialists. Another common
research defect is the failure to control for
spontaneous remissions and the undifferentiated
effects of miscellaneous therapeutic
factors operating in all treatment situations.
Such factors have been partially controlled in
this study by selecting chronic cases which
have shown no improvement or even become
worse during prior institutional or outpatient
treatment so that their refractoriness to therapy
had been demonstrated. Finally, mental
status at the start and end of therapy was
rated objectively using a Prognostic Index
(PI) score (1) measuring the five factors of
malignancy of symptoms, trend of disorder,
chronicity, degree of social and economic
incapacitation, and subjective feelings or status.


Description of Cases 


The following criteria were used for selecting
cases: (a) Each case was diagnosed as to
nature and degree of disorder by an independent
expert, usually the referring specialist. (b)
Only cases having an initial PI score above 15
were selected, thus establishing the fact of
markedly incapacitating disorder. (c) All
cases showed definite refractoriness to therapy
as evidenced by chronicity even though having
been exposed to prior therapies ranging
from hospitalization to psychoanalysis for
long periods. (d) Only cases motivated to accept
therapy and cooperative enough to continue
for at least 10 one-hour interviews and
judged by the therapist as having sufficient
personality resources to offer some hope for
reorganization of personality integration in
depth were used.

The Ss consisted of the first 50 cases from 
the files in the years 1953-1956 which met 
these selection criteria. There were 20 males 
and 30 females, mean age 32.86 years (range, 
16 to 63 years), with average education of 
13.08 years (range, 8 to 20 years). Twenty- 
nine were treated while guests at Spring Lake 
Ranch, Cuttingsville, Vt., a psychiatric half- 
way house accepting referrals only from psy- 
chiatrists. Most of them either had been in 
psychiatric sanitaria or were on the borderline 
of institutionalization. All were ambulatory 
cases with behavior disorders or eccentricities 
so severe that they could not remain at home 
or in the community; many were considered 
unimproved even after years of treatment 
elsewhere. Twenty-one cases were referred by 
psychiatrists or physicians; all previously re- 
ceived treatment ranging from psychotherapy 
to tranquilizing drugs. Twenty-five Ss had
been in psychiatric hospitals for periods ranging
from one to 32 months (mean 12.5
months), and 9 more had outpatient therapy
lasting from six months to 21 months (mean
15.6 months). Methods of treatment had included
psychotherapy, psychoanalysis, electroshock,
insulin coma, and chemotherapy.
Table 1 presents the pretherapy psychiatric 
diagnoses made by referring specialists. Many 
cases presented mixed symptomatology, and 
several had been given different diagnoses re- 
flecting varying status. The final posttherapy 
diagnosis differed from the initial diagnosis in 
7 cases, in 5 of which there was a differential 
diagnostic question whether alcoholism with 
psychopathic character disorders was primary 
or secondary to complicating Psychoneurotic 
reactions or prepsychotic state, During psy- 
chotherapy, the diagnostic process was ampli- 
fied constantly by the addition of dynamic 
insights upon which the moment-to-moment 
selection of treatment methods was based, 


Rating Scales 


Table 1

Diagnostic Classification of Case Material by
Independent Experts


Category                                   N

Schizophrenic reactions
  Paranoid type                            5
  Catatonic type                           2
Manic-depressive reactions                 5
Involutional states                        2
Chronic brain syndromes
  With convulsive disorder                 …
  Postencephalitic                         …
Psychoneurotic reactions
  Immaturity-dependency                    …
  Anxiety hysteria                         …
  Anxiety tension states                   …
  Depressive reactions                     …
Character disorder with alcoholism         …
Alcoholism with other addictions           …
Psychosomatic disorders                    …
Mixed types, prepsychotic                  …
Homosexual with alcoholism                 …


Table 2


Sources of Rating Data for Prognostic Index Scores


Rating Scales            Pretherapy               Posttherapy

Malignancy of symptoms   Referring specialist,    Therapist, staff,
                         usually psychiatrist     consultants
Trend of disorder        Case record and          Therapist, staff
                         other informants
Chronicity (Pre) and     Case record and          Therapist
  Prognosis (Post)       other informants
Incapacitation           Case record and          Staff, friends,
                         other informants         relatives, employers
Subjective status        Subjects                 Subjects


malignancy of symptoms ranging from simple
behavior disorder to severe psychotic reactions
was used to establish degree of pathogenicity
on Scale 1. The rating on Scale 2 on
the trend of the disorder gave evidence concerning
improving, unchanging, or worsening
status. Chronicity was rated objectively on
Scale 3 from the case history. Social and economic
incapacitation as defined in the Veterans
Administration psychiatric examination
standards was rated in Scale 4. Subjective
status was determined by the Ss’ reports concerning
how they felt, usually with confirmatory
evidence from observations and staff reports,
on Scale 5.
The sources used to gather data for PI
ratings are shown in Table 2. The PI ratings
on the five scales are intended to
sample all types of evidence, including referring
specialist, informants as to case history,
therapist, staff, friends and relatives, and the
Ss themselves. Given adequate case histories
and staff reports which routinely covered the
desired topics, PI ratings can be transcribed
from case records by clerical assistants with
final checking by the responsible therapist.
Gradations on each scale rated from 1 to 5
reflect increasing degrees of severity, with a
summated score of 25 representing the most
severe and malignant disorder, rapidly progressive,
of long chronicity, totally incapacitating,
and subjectively unbearable. A
total score of 15 was arbitrarily selected as
the cutting point below which cases were not
accepted for this study, thus insuring that all
cases were rated as at least moderately severe
and more than 50% incapacitating. Scores between
6 and 10 indicate minimal severity, perhaps
annoying but not incapacitating, and a
score of 5 (the lowest possible rating if all
scales are scored) indicates essential normality
within the ordinary wear-and-tear or
deterioration of life.
The range of PI scores of the 50 cases at
the start of therapy was from 16 to 24 (mean
[...]), indicating moderately severe to
incapacitating disorder. Clinically, if these cases
had been any worse, they could not have been
treated at Spring Lake Ranch or on an outpatient
basis. The mental status of each case
was recorded in an index of five numbers,
such as 54254, in which each figure successively
refers to the degree of severity on
Scales 1-5, respectively. This index not only
yields a total score but permits progressive
comparisons of the five dimensions from ratings
on different dates.
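
The five-digit index lends itself to a small worked example. The sketch below is illustrative only: the function names and the one-word band labels are assumptions introduced here, while the score ranges (5, 6 to 10, 11 to 15, above 15) and the sample index 54254 come from the text.

```python
def parse_pi_index(index):
    """Split a five-digit PI index such as '54254' into scale ratings."""
    ratings = [int(ch) for ch in index]
    if len(ratings) != 5 or any(r < 1 or r > 5 for r in ratings):
        raise ValueError("expected five ratings, each from 1 to 5")
    return ratings


def pi_total(index):
    """Summated PI score (5 = lowest possible, 25 = most severe)."""
    return sum(parse_pi_index(index))


def severity_band(total):
    """Rough band for a total score; the labels are paraphrases."""
    if not 5 <= total <= 25:
        raise ValueError("total must lie between 5 and 25")
    if total == 5:
        return "essentially normal"
    if total <= 10:
        return "minimal"
    if total <= 15:
        return "moderate"
    return "severe"


def meets_cutoff(total):
    """Only cases with an initial score above 15 entered the study."""
    return total > 15


total = pi_total("54254")
print(total, severity_band(total), meets_cutoff(total))
```

Keeping the five ratings separate, rather than storing only the total, is what allows scale-by-scale comparison between ratings made on different dates.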
Cases were rated again at the end of
therapy with a modified PI in which the
chronicity scale was replaced by an estimate
of prognosis with ratings by the therapist of
whether improved status would be maintained.
The PI ratings at the end of therapy,
therefore, were made by the same method and
form for reporting except that future
chronicity was predicted instead of past
chronicity on the third scale. The range of PI
scores at the end of therapy was from 5 to
23 (mean 11.13).


Results 


Table 3 shows the distribution of PI scores
at the start and end of therapy. Of the 50
cases, 21 of whom were socially and economically
incapacitated at the start of therapy, 5
were unchanged or worse in that both PI
scores were above 16. Nineteen cases finished
with scores between 11 and 15, still showing
moderate disability with disturbing but bearable
complaints and marginal eccentricities;
these cases were considered as partially rehabilitated
with hope for some productivity under
supervision and therapy. Twenty-three
cases scored from 6 to 10 on their final rating,
with only minimal residuals and were considered
functionally normal. Three cases had
final scores of 5 and considered themselves
fully cured.

A further check on the validity of terminal 
PI ratings was made in terms of a follow-up 
study of actual status as of February 1, 1957, 
Thirty-two cases are gainfully employed or, 
if women, returned to family life with ade- 
quate adjustment. Nine cases are not work- 
ing; 4 are in institutions, 1 about to leave on 
trial; and 1 each, whereabouts unknown, 
eloped, or dead. Also, as of this date, 33 con- 
sider themselves in need of no further inten- 
sive therapy, 11 continue to seek interviews 
intermittently, and 6 have withdrawn from 
therapy against advice. 


Dynamics of Therapeutic Results 


All of the diagnostic and therapeutic meth- 
ods outlined elsewhere (2, 3) were utilized at 
some point in the handling of some of these 
cases. Depth modification of personality or- 
ganization was always the primary therapeu- 
tic objective while at the same time giving 
detailed attention to secondary objectives 
such as symptom relief. This approach is so 
complex that existing methods are incapable 
of objectifying and quantifying what takes 
place. Unfortunately, resources were not avail- 
able for pre- and posttherapy psychometric, 
sociometric, and projective analysis of enough 
cases to warrant detailed statistical treatment. 

The average PI rating for malignancy of 
symptoms on Scale 1 at the start of therapy 
was 4.34, and at the end, 2.40, on a five-point 


Table 3


Distribution of Prognostic Index Scores at Start
and End of Therapy (N = 50)


Score Intervals   Pretherapy   Posttherapy

24-25                  4
22-23                 14            3
20-21                 20
18-19                 10
16-17                  2            2
14-15                               4
12-13                              12
10-11                               7
8-9                                14
6-7                                 5
5                                   3


start, crying constantly, petulant, annoying [...]
and completely demoralized. Elsewhere considered
unsuited for psychotherapy because of organic state.
Treatment began nondirectively, repeated Lap nce 
and catharsis necessary. Reassured by much personal 
attention, tutoring and assurance that she would not 
be sent away. Self concept altered to perceive herself 
more realistically, no longer trying to compete with 
former equals, but accepting new role of retiring but 
useful elder citizen. Former untenable goals dropped 
with new philosophy of acceptance emphasizing emo- 
tional calm and stability. After several months of 
psychotherapy, oculogyric crises formerly occurring 
several times weekly now ceased; Parkinsonian symp- 
toms improved, and mental functioning better. Now 
a healthy respected group member. Case cited to 
indicate how remaining personality resources may be 
maximized in organic conditions.


Discussion 


Intensive psychotherapy involves the most 
difficult kind of work, both by therapist and 
client, to uncover and recondition deep patho- 
logical processes. In our experience, one of the 
commonest therapeutic errors is to abandon 
methods before they have had sufficient op- 
portunity to achieve results. Table 3 suggests 
that all methods have their successes and fail- 
ures, and it is our practice to keep working 
and trying different approaches until some 
leverage is achieved. In our opinion, even such 
older, largely discredited, methods as persua- 
sion and suggestion may be effective if pur- 
sued diligently enough. In this series, clients 
routinely were given maximum opportunity to 
solve problems nondirectively, with more ac- 
tive directive methods being introduced only 
where the client was unable to progress alone.
In most cases, the therapist functioned as a
co-worker, helping the client gain insights,
working patiently together to practice new
modes of behavior, and consistently attempt-
ing to operate in a friendly informal atmos-
phere with most sessions having the quality
of friendly conversations or bull sessions. Cli-
ents were always told not to accept interpre-
tations unless they fit, to criticize freely, and
to expect that progress of therapy would not
be steady at all times. Above all, they were
taught to remain in the situation, not giving
up but continuing to cope with behavior even
through painful periods, when they were
frankly told that they might get worse before
new gains could be consolidated.


Summary 


A group of 50 cases presenting severe be-
havior disorders which had proven [...]
to other therapies was selected by [...]
criteria for evaluation of the effects of inten-
sive eclectically oriented psychotherapy. The
cases were diagnosed by independent special-
ists prior to therapy, manifested [...]
severe to incapacitating disorder, had not im-
proved sufficiently to become [...] in the
community under prior treatment ranging
from hospitalization to [...], and
were cooperative enough and with sufficient
resources to participate in deep psychother-
apy. Mental status before and after therapy
was objectified with Prognostic Index scores.
[...] incapacitated at the start, 6% [...]
as totally cured after therapy, 4[...]% [...]
residuals, 38% showed marginal [...]
with some reduction of symptoms and in-
capacitation, and 10% were unchanged or
worse. It is concluded that eclectically ori-
ented psychotherapy is capable of improving
personality integration at both symptom
and depth levels in selected severe cases.
The Effect of Varying Degrees of Projection
on Test Scores¹


Edward J. Wallon² and Wilse B. Webb
U. S. Naval School of Aviation Medicine


Disputes about the relative merits of objec-
tive and projective personality measures have
long been a part of the area of psychological
[...]. The validity of objective person-
ality [...] has been repeatedly challenged (4),
and the dissatisfaction and general lack of
confidence of psychologists in these measures
has been expressed frequently (9). Projec-
tive devices, on the other hand, have often
been [...] to criticism because of their lack
of [...], norms, and scoring objec-
tivity together with the inevitable accompani-
ment of low reliability.

One direction that efforts to resolve these
[...] have taken is represented by various
attempts to “objectify” projective techniques.
These endeavors have been concerned, in the
main, either with the development of objec-
tive [...] scoring systems (1, 2, 3, 10,
12) applied to the “free” responses ob-
tained from the administration of the original
form of the projective test, or with altering
the form of the original projective test by
equipping the test cards or items with some
type of multiple-choice or “yes-no” response
from which the subject must select one of the
alternatives provided, instead of giving an in-
[...] response in the usual manner (5, 6, 7,
8). Both methods of objectification are di-
rected toward fulfilling the need for what
[...] (14) described as tests which are
“[...] for the subject, objective for the
scorer.” The question arises, however, as to just
how much or what, if anything, is “lost” or
“[...]” in the process of converting a pro-
jective technique into an objective equivalent.
This paper is directed toward an examination
of this question.


1 Opinions or conclusions contained in this paper
are those of the authors. They are not to be con-
strued as necessarily reflecting the view or the en-
dorsement of the Navy Department. A form of this
paper was published as a research report of the
Naval School of Aviation Medicine.

2 Now at Purdue University.

It was proposed that given projective tests 
be administered according to the usual meth- 
ods, that corresponding objective forms of 
these tests be devised and administered, and 
a comparison made between the scoring pat- 
terns obtained for each form. It was further 
proposed that an intermediate “projective- 
objective” test procedure be introduced in 
which the subjects (Ss) first would respond to 
the projective test in the normal manner fol- 
lowing which they would match their projec- 
tive productions with those of the multiple- 
choice alternatives constructed for the objec- 
tive form of the test. It was hypothesized that 
such a partial retention of the projective as- 
pect of the original test technique would in- 
crease the fidelity of the objective answers 
thereby obtained, permitting closer agreement 
with the responses commonly elicited by the 
projective form. 

The two projective tests selected for use in
the investigation of these phenomena were the
Rosenzweig Picture-Frustration Test and the
Sentence Completion Test.


Rosenzweig Picture-Frustration Test


The Rosenzweig Picture-Frustration Test is 
a projective technique designed to measure re- 
actions to frustration. It consists of a series of 
24 cartoons depicting frustrating situations. In 
each situation, one of the persons is remark- 
ing on the frustrating occurrence, and a blank 
caption box is provided for the other person 
involved, into which the subject writes a reply
to the comment made. These responses are
then scored as to the direction of aggression
and type of reaction. The directions of aggres-
sion are: (a) Extrapunitive, in which hostility
is directed toward some person or thing in the
environment, (b) Intropunitive, in which hos-
tility is directed toward the self, and (c) Im-
punitive, in which hostility is minimized or
denied. These categories are designated by the
symbols E, I, and M. The types of reactions
are Obstacle-Dominant, Ego-Defensive, and
Need-Persistive. The present study, however,
was concerned only with investigating the
direction of aggression. The final score is
determined by summing the number of E, I,
and M responses. Since each
item receives a credit of one point, the total 
final score equals 24. When responses combine 
both intropunitive and extrapunitive aspects 
of aggression or a partly impunitive and partly 
aggressive reaction, the score for the given 
item is split, each aspect receiving a credit of 
0.5, although the total score for the item still 
remains 1. Occasionally, responses are encoun- 
tered which are too brief to be scored or which 
are the result of an inappropriate interpreta- 
tion of the situation. The number of such un- 
scorable responses, however, is small, only two 
per cent of the present sample falling into this 
group. 
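
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the tuple encoding of each item, and the sample protocol are hypothetical and do not come from the Rosenzweig manual. It assumes each of the 24 items has already been coded by a scorer as one direction (full credit of 1), two directions (0.5 each), or unscorable (empty).

```python
# Hypothetical sketch of direction-of-aggression scoring as described in the text.
# Category symbols follow the text: E (Extrapunitive), I (Intropunitive), M (Impunitive).
CATEGORIES = ("E", "I", "M")


def score_direction_of_aggression(item_codes):
    """Sum E, I, and M credits over a protocol.

    item_codes: one tuple per item, e.g. ("E",) for a full credit,
    ("E", "I") for a split item (0.5 each), or () for an unscorable item.
    Returns (totals, number_of_unscorable_items).
    """
    totals = {category: 0.0 for category in CATEGORIES}
    unscorable = 0
    for codes in item_codes:
        if len(codes) == 0:
            unscorable += 1
            continue
        # The text only describes single and two-way split scores.
        if len(codes) > 2 or len(set(codes)) != len(codes):
            raise ValueError(f"unsupported coding for an item: {codes!r}")
        if any(code not in CATEGORIES for code in codes):
            raise ValueError(f"unknown category in {codes!r}")
        credit = 1.0 / len(codes)  # 1 for a single code, 0.5 for a split
        for code in codes:
            totals[code] += credit
    return totals, unscorable


# Invented 24-item protocol: 10 E, 6 I, 6 M, one E/I split, one unscorable.
protocol = [("E",)] * 10 + [("I",)] * 6 + [("M",)] * 6 + [("E", "I")] + [()]
totals, unscorable = score_direction_of_aggression(protocol)
print(totals, unscorable)
```

Because every scorable item contributes exactly one point, the three totals sum to 24 minus the number of unscorable items, which is why mean E, I, and M scores for a form add to roughly 24.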
An objective form of the test was devised
in which the S received, along with the Ro-
senzweig cartoon booklets, a series of multiple-
choice responses to each situation. Thus, in- 
stead of writing his “free” response into the 
caption box, the S was directed to select one 
of the three multiple-choice items for each 
situation. Each series of multiple-choice re- 
sponses contained one Extrapunitive, one In- 
tropunitive, and one Impunitive response se- 
lected from among typical scoring examples 
provided for each item in the Rosenzweig 
scoring manual (13). 

The projective form was given to 88 naval 
aviation cadets in their first week of preflight 
indoctrination. As soon as the Ss had com- 
pleted their projective protocols, they were 
presented with the multiple-choice objective 
form and instructed as follows: 


On this sheet of paper are listed 24 situations to
which you have been asked to provide replies for
one of the characters portrayed. Each statement de-
scribing a particular situation is followed by three
possible replies. You are to compare the answers you
have given to each situation in the booklet with the
suggested responses below and indicate which of the
three alternatives your answer most closely agrees
with or resembles in meaning. In those cases where
your response does not resemble any of the multiple-
choice responses provided, you are to mark the al-
ternative listed which is least unlike your response.


In essence, the Ss were asked to score their
responses as Extrapunitive, Intropunitive, or
Impunitive by comparing and matching them
with examples of each category. Two scores
were then available from this administration:
projective scores and projective-objective
scores.

A second administration using the objective
multiple-choice version only was given to a
second group of 71 cadets in preflight indoc-
trination.

The third administration of the test was
presented under essentially a set to fake. A
third group of 70 cadets was given the mul-
tiple-choice Rosenzweig and instructed to se-
lect the one answer to each item which they
thought was “best,” that is, “the answer which
would put you in the best light as far as
others are concerned.” In this way a measure
of the social acceptability of the items was
obtained.

Results 


The projective form of the test was scored
by the usual procedure outlined in the Ro-
senzweig scoring manual. Since types of re-
action were not included in this study, only
the E, I, and M scores were considered. The
mean percentage scores of these categories
were compared with the Rosenzweig norms
(13). There were no significant differences in
any category.

Table 1 presents the means obtained on
each of the three categories for each adminis-
tration of the test.
Table 1


Mean E, I, and M Scores for All Test Forms


                         Extra-      Intro-      Im-
Forms                    punitive    punitive    punitive

Projective                10.65        6.31        6.51
Projective-Objective       6.36        6.00       11.57
Objective                  3.03        9.20       11.77
Best Answer                 .83        9.64       13.54

Table 2


Table of t Ratios Obtained Between the Various
Methods of Presentation


                          Projective-                “Best”
                          Objective     Objective    answers

Extrapunitive:
  Projective                16.92*        17.72*      29.05*
  Projective-Objective                     6.82*      14.11*
  Objective                                            6.25*
Intropunitive:
  Projective                 1.26          7.65        9.85*
  Projective-Objective                     7.96*      10.17*
  Objective                                             .99
Impunitive:
  Projective                20.65*        14.06*      20.56*
  Projective-Objective                    [...]       [...]
  Objective                                            4.28*

* Significant at the .001 level.


Discussion


[...] M scores obtained on the projective [...]

[...] projective potential for a [...] decreased, scores were
yielded [...] the faked version of the [...] results obtained
with the [...]

[...] projective and projective-objective forms there were
13 significant item shifts away [...]

[...]
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projective test than do purely objective trans- 
formations. On the projective-objective ad- 
ministration, two of the three categories under 
study in this experiment, Extrapunitiveness
and Intropunitiveness showed mean scores 
which were significantly closer to the projec- 
tive mean scores than did the objective test. 
Thus, the effect of requiring the individual to 
score his own responses (by matching them 
with examples of the scoring categories), once 
he has already committed himself on the pro- 
jective form of the test, serves to limit the 
tendency to select as many acceptable re- 
sponses as he would ordinarily if presented 
with the objective form only. Such projective- 
objective procedures, if adequately developed, 
might prove useful in bridging the gap be-
tween projective and objective test methodol-
ogy by permitting at least a partial retention
of the desired features of both.


Sentence Completion Test 


The second test employed was a sentence
completion test in use at the U.S. Naval
School of Aviation Medicine. An objective
[...] sentence completion test, devel-
oped [...] the School of Aviation Medi-
cine, was available for comparison with
the projective form. The multiple-choice al-
ternatives [...] the objective form had
[...] first having a group of naval
[...] the projective form, cate-
gorizing [...] according to content,
[...] on the basis of
[...] and social adjust-
ment [...] indicated lack of adjustment [...].
Several examples [...] others:

[...] may fall short [...]

[...] get my wings as anyone else here.

The final form of the completed multiple-
choice test (designated M-1) consisted of 47 
items of which only 32 were keyed for per- 
sonal social adequacy. The remaining 15 were 
buffer items relating to social attitudes and 
were not tabulated in the score. Every malad- 
justed response received a score of one point, 
each nonmaladjusted response a score of 0. 
The possible range of scores was from 0 to 32. 

In the present study, a fourth choice was 
added to each set of multiple-choice items, a 
“None of these” category which gave the S the
option of avoiding matching a response if he 
felt it was not comparable to any of the 
choices offered. 

In the first administration of the test, a 
group of 92 naval aviation cadets in their first 
week of preflight indoctrination was given the 
original projective form of the test and in- 
structed to “Finish each sentence in any way 
you like as long as the completed sentences 
express what you actually feel or do.” Imme- 
diately following the completion of the projec- 
tive form, the multiple-choice form of the sen- 
tence completion test was given to the cadets 
with the following instructions: 


Below is the list of [...]
have just completed. [...]
three possible ways [...]
[...] any of the suggested
answers, indicate this by answering choice “D.”


The second administration consisted of giv- 
ing a second group of 103 cadets the objective 
form of the test only. 

The third administration was given under 
a set to fake. The multiple-choice sentence 
completion test was given to a third group of 
70 cadets representative of the total sample 
under study. The group was asked to select 
the “worst” response for each item, that is, 
the one which “you think would make you 
appear mentally or emotionally disturbed or 

put you in the worst light as far as others are
concerned.” In this manner, the order of ac-
ceptability of the responses to the group could
be determined. This group did not have the
option of using the “None of these” category.
For each of the 32 items, the multiple-choice
alternative selected most frequently as the 
“worst” was used as the “maladjusted” re- 
sponse. A scoring key was devised, based on 
these cadet-selected “maladjusted” items. All 
three forms of the test were scored with this 
key. Each maladjusted response selected by 
the S received a score of “1”, each nonmalad-
justed response received a score of “0”. Since
there was no suitable quantitative method
available for scoring the projective form of
the M-1 sentence completion test, it was not
possible to include this form in the analysis
of results.
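
As a rough illustration of how such a cadet-derived key could be built and applied, the sketch below takes the alternative most often chosen as “worst” for each item and scores one point for every keyed response. The function names, the letter coding of alternatives, and the toy data are assumptions made for the example; the text does not say how ties between alternatives were resolved, and here the first alternative to reach the top count wins.

```python
from collections import Counter


def build_maladjusted_key(worst_choices):
    """Map each item to the alternative most frequently picked as 'worst'.

    worst_choices: dict of item number -> list of alternatives chosen by the
    'worst answer' group. Ties go to the alternative encountered first
    (an assumption; the source does not describe tie handling).
    """
    return {
        item: Counter(choices).most_common(1)[0][0]
        for item, choices in worst_choices.items()
    }


def raw_score(responses, key):
    """Count keyed ('maladjusted') responses: 1 point each, 0 otherwise."""
    return sum(1 for item, keyed in key.items() if responses.get(item) == keyed)


# Toy data for three items and four 'worst answer' respondents (invented).
worst_choices = {
    1: ["A", "A", "C", "A"],
    2: ["B", "C", "C", "C"],
    3: ["B", "B", "A", "B"],
}
key = build_maladjusted_key(worst_choices)
score = raw_score({1: "A", 2: "B", 3: "B"}, key)
print(key, score)
```

With the full 32 keyed items, the raw score from this rule would range from 0 to 32, matching the range given for the M-1 form.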


Results 


It was found that on the projective-objec-
tive form, 36.2 per cent of the total responses
fell into the unscorable “None of these” cate-
gory, while in the objective administration,
only 13.0 per cent fell into that category. In
view of this large disproportion in unscorable
categories, a comparison could not be
made between raw scores obtained. Instead,
percentage scores were obtained for each indi-
vidual, based on the number of maladjusted
responses chosen relative to the number of
scorable responses (other than “None of
these”).
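
The percentage score just described can be written out as a short function. This is a sketch under assumptions: the function name and data layout are hypothetical, and the “None of these” alternative is taken to be coded as choice D, which the garbled instructions suggest but do not state cleanly.

```python
def percentage_maladjusted(responses, key, none_choice="D"):
    """Percentage of scorable responses that match the maladjusted key.

    responses: dict of item number -> chosen alternative.
    key: dict of item number -> keyed ('maladjusted') alternative.
    Items answered with `none_choice`, or not answered, are unscorable and
    are left out of the denominator. Returns None if nothing is scorable.
    """
    scorable = [
        item for item in key
        if responses.get(item) is not None and responses[item] != none_choice
    ]
    if not scorable:
        return None
    maladjusted = sum(1 for item in scorable if responses[item] == key[item])
    return 100.0 * maladjusted / len(scorable)


# Invented four-item example: one "None of these", two keyed responses.
example_key = {1: "A", 2: "B", 3: "C", 4: "A"}
example_responses = {1: "A", 2: "D", 3: "B", 4: "A"}
pct = percentage_maladjusted(example_responses, example_key)
print(round(pct, 1))
```

Dividing by the number of scorable responses rather than by 32 keeps Ss who used “None of these” often from appearing better adjusted merely because fewer of their answers could be keyed.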

The means of the scores obtained on the
three administrations included in this study
are presented in Table 3.

The significance of the differences was
tested by the Mann-Whitney “U” test. All
three of the scores were significantly different
from each other at the .001 level of signifi-
cance.

Chi-square comparisons of the [...]
of responses to each of the [...]
categories were performed between the pro-
jective-objective and objective test versions
for all 32 items. Fourteen of the items yielded
[...]

had been given on the projective form with
the multiple-choice version. Matchings by 
psychologically trained persons were not pos- 
sible with this test because of limitations of 
personnel and time available. 


Procedure 


The projective responses of the 88 cadets
taking the Rosenzweig Picture-Frustration
Test were given, along with the multiple-
choice form of the test, to a group of 88 peers
(designated as the “Other Cadet” group) who
were instructed as follows:


Below are listed the 24 situations shown in the test
booklets. Each statement describing a particular situ-
ation is followed by three possible replies. In each
[...] you are to match the answers which have al-
ready [...] in the booklets with the mul-
tiple-choice [...] select the one which
[...] agrees with [...]
[...] In those cases in which the given answer
[...] at all with any of the responses below,
[...] least unlike the answer
in the booklet. [...] selection of answers by
filling in between the [...]
B, and C on your IBM sheet.


[...] projective forms of the test
[...] of six research
[...] of the Aviation Psychology Lab-
[...]


Table 4


A Comparison of Mean Scores for Categories for All 
Administrations and Methods of Scoring 


Test forms and            Extra-     Intro-      Im-
scoring procedures        punitive   punitive    punitive

Projective scores          10.65       6.31       6.51
Psychologists’ scores       8.52       3.84      11.45
Other Cadets’ scores        7.33       5.44      11.15
Self scores                 6.36       6.00      11.57
Objective scores            3.03       9.20      11.77
Best Answer scores           .83       9.64      13.54


the matchings were done by psychologists and
when they were performed by the two cadet
groups is shown in Table 5.

The Pearson r correlations between the
groups for all three categories are shown in
Table 6.

On the sentence completion test, the mean
percentage score for maladjusted responses
was 29.13 for the [...]
means was determined by the “sign” test (12).
The value yielded [...]
under the .02 level,


Table 5 


Table of t Ratios for E, I, and M Among Scorers
(N = 88) 


Other Psycholo- 
Self Cadet gists? 
scores scores scores 
Extrapunitive: 
Projective scores 16.92** 17.75% uaga 
Self scores 3.37% ae 
Other Cadet scores 4.52% 
Intropunitive: 
Projective scores 1.26 3.92** 93 94 
Self scores 2.00 389+% 
Other Cadet scores 7.02** 
Impunitive: 
Projective scores 20.65** 16.93**  16,65** 
Self scores 1.35 1 


Other Cadet scores 


* Significant at .01.
** Significant at .001.


Discussion


The small but significant difference between
the scores obtained from self-matchings and
matchings by other cadets on the Rosenzweig
Extrapunitive category is in the direction of
agreement with the original projective
extrapunitive [...] score and suggests that a
more accurate scoring may be obtained by
scoring by others. It is suggested that this
increased accuracy is due largely to the fact
that the other cadets, being less personally
involved in the scoring, are not as prone to
be defensive in making their comparisons.
Counteracting this, how-

Table 6 


Pearson r Correlations Between Scorers for
E, I, and M Categories


(N = 88) 
Other P ie 
Self Cadet pn 
scores scores 

Extrapunitiye: 83 
Projective scores 71 83 73 
Self scores 96 72 
Other Cadet scores 

Intropunitive: 47 
Projective scores .23 A 33 
Self scores “a 44 
Other Cadet scores 

Impunitive: 57 
Projective scores -63 36 2 
Self scores an 52 
Other Cadet scores 


ever, is the fact that the individual who
matches his own response had an advantage
over the independent observer in that he has
a greater knowledge of the true meaning of
the statements. Such underlying [...]
sarcasm and irony are frequently undetectable
to an outsider from the mere wording of the
statement.

The significant improvement of the psy-
chologist group over the mean E score of the
“Other Cadet” group may be the result either
of increased understanding of personality dy-
namics, general familiarity with personality
inventories, or practice effects (each psychol-
ogist matched the responses of approximately
15 Ss as compared to one per S by the “Other
Cadet” group).

Table 4 shows considerable sameness of
Impunitive scores, all three groups having
almost identical results which very closely ap-
proximate the scores obtained on the objective
version of the test. All three groups placed a
significantly larger number of responses into
this category than actually occurred as indi-
cated by the projective scores. The reason for
[...] is not clear, although it may
represent a readiness on the part of all per-
sons [...] the tests to view a response as
Impunitive unless the indications of extra-
punitiveness or intropunitiveness were strong
and clear-cut. In other words, it might have
been a catch-all for responses not falling clearly
into either of the other two categories.

The table of correlations (Table 6) shows
that there is considerable agreement among
scorers for the Extrapunitive category.
[...] the ranking of individuals, on
this category at least, remained high even
though absolute size of scores varied consid-
erably among the scorers. Impunitive correla-
tions showed moderate agreement and the
Intropunitive showed the smallest relation-
ship for the different scoring groups.

The [...] the sentence completion test
[...] indicate that significant differences
[...] between the matchings of Ss themselves
and those of other Ss in the direction of un-
acceptable [...], in this case “maladjustment.”
[...] a smaller tendency toward ego-
defensiveness on the part of other Ss is sug-
gested as a possible explanation of this dif-
ference in scoring.


Summary
Numerous attempts have been made to de-
velop objective versions of standard projective
[...] in order to overcome the limitations
[...] each method suffers when used inde-
pendently. This investigation was undertaken
(a) to ascertain the extent and nature of the
changes which such transformations incur,
and (b) to test the hypothesis that a partial
transformation of a projective technique into
an objective form would yield scores which
resembled those of the original projective test
to a significantly greater degree than a fully
objective version. A projective test of reac-
tions to frustration and a sentence completion
test were used.

The results obtained indicate that signifi-
cant differences occur between projective and
objective versions of the same test in which
the objective forms yield scores which ap-
proximate “socially acceptable” responses to
a considerably greater extent than do the pro-
jective versions. It was also found that the
scores derived from a partial objectification
of the projective technique resembled “socially
unacceptable” responses (extrapunitiveness
and maladjustment), as obtained on the projec-
tive version of the test, considerably more
than fully objective adaptations. Among the
possible explanations for these differences in
scores the following were suggested: (a) The
“fakability” of a test decreases with the
amount of projective potential characteristic
for a test form, the more projective forms
being the less fakable; (b) the [...] of the
multiple-choice responses on the objective
[...] may have led to a consistent rejection of
[...] socially unacceptable responses; (c) the
predisposition to cheat may be less on a projec-
tive form than on an objective form because
[...] projective form does not require an indi-
vidual to select inaccurate, restricted self de-
[...]

[...] projective technique is
[...] solution to the
need for tests which combine the desired fea-
tures of the projective techniques [...]
and ease of scoring of objective [...]

[...] accuracy of scoring is also in-
[...] other than the ones
[...] form of the test match
[...] with the multiple-choice form
The present study was undertaken to de-
termine whether projective responses can dis-
tinguish between different degrees of the sex
drive when the latter is not artificially in-
duced. In a study by Clark (4), it was dem-
onstrated that sex responses on the TAT
(Thematic Apperception Test) are influenced
by presentation of pictures of nude women
before the experiment. However, although
such a study has the advantage of a high
degree of experimental control, it raises the
question of whether a drive aroused by ex-
ternal stimulation functions in the same man-
ner as one produced predominantly by inter-
nal stimulation. One consideration, in this re-
spect, is that experiments utilizing external
stimuli are apt to induce set effects that
are confounded with drive state, although
clever deception, as used by Clark, can reduce
this possibility. A second consideration, not
unrelated to the first, is that drives produced
by external stimulation are more apt to be
[...] labeled. Finally, a drive aroused by
external stimulation is apt to be more acute
than one produced by mounting internal stim-
ulation. It is obviously the relatively enduring
internally produced drive states that are of
fundamental concern to the user of projective
tests, who views transient states produced by
[...] experiences as sources of error.

One means of obtaining different degrees of
drive strength used in studies on hunger and
thirst (1, 3, 5, 7, 11, 12, 16, 18, 19) has been
to require abstinence for fixed periods. How- 
ever, in any study where abstinence is re- 
quired, cues as to the nature of the study are 
unavoidably presented. The importance of this 
consideration has been demonstrated in two 
studies where it was found that instructional 
set effects exerted more of an influence upon 
drive-related responses than did the drive state 
itself (5, 18). Consequently, the only sound ex- 
perimental evidence for a relationship between 
the hunger drive and food-related responses 
stems from studies where testing was done be- 
fore and after normal meal time and no clues 
were provided about the nature of the study 
(5, 6, 10, 13, 15, 16). Considering that, un- 
like hunger, there are no regularly prescribed 
times for sexual gratification, the best that can 
be done to estimate an individual’s sex drive 
may be to obtain information about his sexual 
behavior. In the present study, two measures 
of drive strength were investigated, one based 
upon reported rate of sexual orgasm, the other 
upon last orgasm in relation to rate. 


Method 


The Ss were 59 male students enrolled in 
an introductory psychology course at the Uni- 
versity of Massachusetts. Testing was done in 
two separate sessions, one group of 29 receiv- 
ing the Group Rorschach Test before the 
TAT, the other group receiving the tests in 
reverse order. After testing was completed, 
a questionnaire on sexual behavior was filled 
out. Following the questionnaire, slides of sex-
ually attractive women were presented, and S
rated each for sex appeal.

ago; the highest rate consisted of three or
more times per week, and was subdivided ac-
cording to whether the last orgasm occurred
less than two days ago. All analyses of vari-
ance involved two degrees of freedom for rate,
one for satiation, two for rate × satiation, and
42 for the error term. The data were inspected 
for skewness and for correlation of means and 
variances; in no case was the need for trans- 
formation indicated. 


Results 


Subjective Sex Ratings 


Scores consisted of ratings on the five-point 
check list of how sexually reactive S felt at 
the moment. The analysis of variance failed to 
reveal a relationship between either rate or 
satiation and self-ratings of sexuality, all F 
values being less than one. As might be ex-
pected, the sex drive, in this respect, is differ-
ent from the hunger drive, where high corre-
lations [...]

[...] and projective measures cannot be accounted
for by a common relationship with subjective
sexuality.


Thematic Apperception Test


Murray Need Sex. The range of scores was
0 to 7, with a mean of 3.04. A Pearson
product-moment correlation of .89 (N = 49)
[...] reliability. Analysis
of variance indicated that rate was significant
at the [...] level (F = 8.31), and that no
other source of variance approached signifi- 
cance. The mean scores for the three rates in 
ascending order were 1.12, 3.31, and 4.06, in- 
dicating a positive relationship between Mur- 
ray need sex scores and rate of orgasm. 

Appealingness of Sex Object. The range of 
scores was from -5 to 7, with a mean of 2.35.
The interscorer reliability coefficient was .84.
Analysis of variance failed to reveal any source
that approached significance.

In a recent study on thematic apperception
as a measure of physiological drive (6), there
was some indication that negative scores pre-
dicted in the same direction as positive scores
and, therefore, should not be algebraically
summed. Accordingly, a second analysis was
performed disregarding algebraic sign. In this 
analysis, rate was significant at the .01 level 
(F = 5.47). The order of the means, in as- 
cending order, was 2.56, 3.34, and 4.81, indi- 
cating a positive relationship between Appeal- 
ingness of sex object scores and rate of orgasm. 


Rorschach Content 


The Rorschach content scores could not be
analyzed by analysis of variance as the inci-
dence of scorable responses was too low. Con-
sequently, separate chi-square analyses for
rate and satiation were performed. For rate,
a division as close to the median as possible
resulted in a comparison of 35 Ss with a rate
of two or less times per week and 24 with a
rate of three or more times per week. For
satiation, where Ss were divided as close to
the median as possible for each rate, it was
necessary to eliminate five Ss who fell exactly
at a median point.

The low incidence of occurrence in the
categories precluded highly reliable discrimi-
nation between groups. Following is the num-
ber of Ss, among the total of 59, who pro-
duced at least one response in a category:
Sex imagery, 20; Human sex imagery, [illegible];
Animal sex imagery, 6; Popular sex imagery,
11; Sex object, 12; Sex activity, 10.

Following are the percentages for all scores
of the number of Ss, divided according to
rate, who produced at least one response, the
figure for low-rate being presented first. Sex
imagery: 26% vs. 46%; Human sex imagery:
11% vs. 42%; Animal sex imagery: [illegible] vs.
8%; Popular sex imagery: 17% vs. 21%; Sex
object: 14% vs. 29%; Sex activity: [illegible] vs.
25%. Yates' correction was used for all chi
squares. The only score which significantly
differentiated the groups was Human sex
imagery (.02 level).

The division according to satiation did not
result in significant differences on any score.
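
As a rough check on the Human sex imagery comparison, the Yates-corrected chi-square for a 2 x 2 table can be sketched as follows. The cell counts are assumptions, not figures given in the text: 4 of 35 low-rate Ss and 10 of 24 high-rate Ss are back-calculated from the reported 11% and 42%, and the function name is hypothetical.

```python
import math


def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Continuity correction: shrink |ad - bc| by n/2, but never below zero.
    diff = max(abs(a * d - b * c) - n / 2.0, 0.0)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * diff ** 2 / denom
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Assumed counts: rows are low-rate and high-rate Ss,
# columns are "at least one response" and "no response".
chi2, p = yates_chi_square(4, 31, 10, 14)
print(round(chi2, 2), round(p, 3))
```

Under these assumed counts the statistic is about 5.6, which lies between the .02 and .01 critical values for one degree of freedom (5.41 and 6.63), in line with the reported .02 level.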


Picture Ratings 


Scores were obtained by summing the rat-
ings of sex appeal given to the three pictures.
The range of scores was 3 to 15, with a mean
of 9.77. Analysis of variance indicated that
rate was significant at the .001 level (F =
8.89), and that no other source of variance
approached significance. The mean scores for
the three rates in ascending order were 9.13,
[illegible], and 10.69, indicating a positive rela-
tionship between rating the women in the pic-
tures as high in sexual appeal and rate of
orgasm.


Discussion 


The major finding in the present study was
that in three separate measures, thematic ap-
perception, Rorschach content, and ratings of
sexual appealingness of women whose pic-
tures were shown, a direct relationship was
found between sexual responses and reported
rate of sexual orgasm. However, the Rorschach
finding was somewhat questionable in view of
the number of scores investigated. In this re-
spect it was evident that the Rorschach test
cannot be a sensitive measure of individual
differences in sex drive as it elicits too few
sex responses. If the present findings are
valid, it would indicate that human-related
sex responses on the Rorschach are more pre-
dictive of total sexual outlet than animal-re-
lated responses.

The finding of a direct relationship between
a measure of sex drive and sex responses to
projective techniques is not in accord with a
study reported by Clark (4). In Clark’s study,
when testing was done under similar circum-
stances to the present study, a drive-aroused
group produced fewer sex responses than a
control group; only when testing was done
at a beer party was a positive relationship
found. The most obvious difference between
the two studies is the manner in which drive
was determined. Clark induced the sex drive
by showing Ss pictures of nude women and by
using an alluring female examiner. In the pres-
ent study, two noninduced measures of drive
were investigated, rate of orgasm, and time
since last orgasm relative to rate. Only the
former was found to be related to projective
responses. A possible explanation is that sex
rate is determined by the degree to
which the physical expression of the sex drive
is acceptable to the individual, while Clark’s
induction of drive did not involve acceptance
of the drive. In addition, as has already been
indicated, a drive induced by external stimu-
lation is more apt to be associated with in-
hibitory reactions than one that is predomi-
nantly inwardly determined as it tends to be
more acute and is more readily labeled (6).

A difficulty inherent in most studies on drive 
strength is that the intensity of the drive must 
be inferred from antecedent conditions or con- 
current relationships which do not bear a di- 
rect relationship to the impulse. The present 
findings are particularly vulnerable on this 
point, and might most cautiously be inter- 
preted as simply having demonstrated a rela- 
tionship between projective responses and an 
external criterion. 

In regard to further work on projective re- 
sponses as measures of drive strength, anchor- 
ing the drive in a physiological state rather 
than relying upon verbal report would be an 
obvious improvement. Such an approach is 
currently being explored with women Ss by 
relating sex responses to phase of menstrual 
cycle. Another promising line of investigation 
is suggested by recent work which offers an 
approach to evaluating inhibitory and drive 
reactions as well as their interaction (6). 


Summary 


Fifty-nine college males were given three
projective tests, a test of thematic appercep-
tion, the Group Rorschach Test, and pictures
of attractive women who were rated for sex
appeal. In order to measure sex drive, a ques-
tionnaire was anonymously filled out with in-
formation on average rate of sexual orgasm,
number of days since last orgasm, and sexual
responsivity at the moment. Two measures of
drive were investigated, one based solely on
rate; the other on satiation, as determined by
days since last orgasm, relative to rate. The
major findings may be summarized as follows:

1. Subjective judgment of sexuality was not
significantly related to rate or satiation.

2. Sexual response scores on all three pro-
jective measures were directly and significantly
associated with rate, but none was related to
satiation.

3. Rorschach sex content cannot be a sensi-
tive measure of drive as too few such responses
are elicited. In view of the number of Ror-
schach comparisons made, the results on this
test require verification.

4. On a thematic apperception score, Ap-
pealingness of sex object, responses describing


478 Seymour Epstein and Richard Smith 

9. Kinsey, A. C., Pomeroy, W. B., & Martin, C. E. 
Sexual behavior in the human male. Phila- 
delphia: Saunders, 1948. 

10. Lazarus, R. S., Yousem, H., & Arenberg, A.
Hunger and perception. J. Pers., 1953, 21, 312-


Unsuccessful Differential Diagnosis
from the Rorschach¹


Stewart G. Armitage and David Pearl 
VA Hospital, Battle Creek, Michigan 


Numerous articles appearing in the litera-
ture have attempted to demonstrate a rela-
tionship between Rorschach findings and vari-
ous psychiatric diagnoses. Investigators report
[illegible] in relating test characteristics
to specific diagnostic categories. Some of these
characteristics are the presence or absence of
[illegible] of the Rorschach determinants, their
relative strengths, patterns, ratios, and their
adherence to acceptable criteria. Other meth-
ods have relied more heavily upon the content
of the [illegible] and its characteristics, while still
others have employed both content and deter-
minants in various combinations.

The [illegible] validation of these findings
has in turn given rise to a still greater number
of articles, among which there is considerable
disagreement. Some have indicated that there
is a relationship existent between certain Ror-
schach patternings and psychiatric diagnoses,
while others report findings seemingly dia-
metrically opposed. This confusion has been
variously attributed to differing psychiatric
populations, varying methods of Rorschach
administration and scoring, and the question-
able [illegible] of the psychiatric diagnosis as a cri-
terion. These difficulties are recognized and an
attempt is being made in this study to avoid
them.


Some clinicians object to the use of the
Rorschach primarily as a diagnostic tool.
They point out that its most effective use lies
in such areas as personality description, its
prognostic value, and its indications for treat-
ment possibilities. The diagnostic impression
is considered to be a rather unimportant by-
product. In numerous psychiatric hospitals the
diagnosis is still important, and despite these
objections the Rorschach is [illegible]
diagnostic determination. It becomes impor-
tant then to specifically investigate the poten-
tial of the Rorschach for such use, to ascertain
those aspects of Rorschach utilization which
contribute to successful diagnosis, and to de-
termine whether or not these aspects vary
with differing types of patients.

1 From the Veterans Administration Hospital, Battle
Creek, Michigan.


Method 


Procedure 

All Rorschach records were obtained from 
patients referred to the Psychology Service for 
testing by members of the hospital admissions 
board. Patients who were too ill or whose clin- 
ical manifestations were sufficiently clear-cut 
so that a proper diagnosis could be made 
solely on that basis were not tested. For this 
reason, the sample does not represent a cross 
section of the hospital population. Rather, it 
refers directly to those types of problems 


which require diagnostic assistance. The psy-
chologist’s diagnostic impression was derived
from the Wechsler-Bellevue Intelligence Scale,
Form I, the Rorschach test results, and what-
ever cues he may have obtained from patient-
examiner interaction during the testing ses-
sions. All diagnoses based upon these
data were made prior to the final diagnos-
tic staffings of patients and agreed in 80% of
instances with the final staff diagnoses. Rec-
ords were selected from a large pool of ap-
proximately 1,000 cases which were collected
over a [illegible] period. For this study, only
those records were retained which agreed com-
pletely with independent impressions ob-
tained from psychiatrists, based on their
clinical observations and the history of pa-
tients’ illnesses, and with the classification
made at the final diagnostic staff. All records 
utilized were rescored in accordance with 
Beck’s (1) current scoring standards and were 
considered sufficiently extensive in terms of 
number of responses and detail of inquiry to 
permit an adequate evaluation. 
Two approaches were made to the analysis 
of data. The first attempted to determine a 
relationship between Rorschach determinants 
and specific psychiatric classifications such as 
paranoid schizophrenia, unclassified schizo-
phrenia, neurosis, and character disorder. The
second was based upon the judgments of these 
same four diagnostic categories made by five 
staff psychologists with four to nine years’ di- 
agnostic experience with the Rorschach test. 
The records to be judged were randomly 
drawn from the larger sample employed in the 
first approach with the proviso that none of 
these were originally obtained by any of the 
judges. All identifying materials were re-
moved. The records were then divided
equally into two groups with 60 cases in each.
In the first of these, the psychograms and pro-
tocols were separated, and scoring and loca-
tion designations were removed. Records to be
judged were assembled in groups of ten so
that each assembly included five records from
each of two of the four diagnostic categories
which were to be evaluated. In this manner
all combinations of diagnostic classifications
were obtained. Each judge made 180 judg- 
ments, 60 under each judgmental situation. 
Judges were aware that the records were 
equally divided among the four diagnostic 
groups. They were, however, warned that each 
of the groups of 10 records would not be di- 
vided equally among the four diagnostic cate- 
gories. Each judge received only one record 
group per day, since it was believed that this 
procedure would relieve boredom and mini- 
mize any attempt to distribute judgments 
equally among the four diagnostic categories.
Judges were instructed to place each of the
psychograms, protocols, or combination of the
two in one of the four diagnostic classifica-
tions, employing any method desired and
using any cues available from the presented
materials. They were requested to keep no
record of previous placements since this might
influence their judgmental processes.
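
The arithmetic of the judging design can be checked with a short sketch. It assumes, as the description suggests but does not state outright, that each pairing of two diagnostic categories forms exactly one block of ten records under each condition; the constant and function names are hypothetical.

```python
from itertools import combinations

CATEGORIES = [
    "paranoid schizophrenia",
    "unclassified schizophrenia",
    "neurosis",
    "character disorder",
]
RECORDS_PER_CATEGORY_IN_BLOCK = 5
CONDITIONS = ["protocols", "psychograms", "combined"]


def build_blocks(categories):
    """Return one block of ten records for every pair of categories."""
    return [
        {first: RECORDS_PER_CATEGORY_IN_BLOCK, second: RECORDS_PER_CATEGORY_IN_BLOCK}
        for first, second in combinations(categories, 2)
    ]


blocks = build_blocks(CATEGORIES)
# Six pairings of four categories, ten records each.
records_per_condition = sum(sum(block.values()) for block in blocks)
judgments_per_judge = records_per_condition * len(CONDITIONS)
print(len(blocks), records_per_condition, judgments_per_judge)
```

On this reading the six blocks give 60 records per condition and 180 judgments per judge, which agrees with the figures stated in the procedure.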


Subjects 


The sample was drawn from World War II
patients ranging in age from 20 to 45 years.
Brain damaged or epileptic patients were ex-
cluded as were those who received electro-
shock, insulin coma, or chemotherapy treat-
ment prior to their psychological test evalua-
tions. A total of 809 records was used in the
first approach which investigated the rela-
tionship between Rorschach determinants
and psychiatric classification. This sample
included 140 paranoid schizophrenics, 341
unclassified schizophrenics, 243 neurotics, and
85 character disorders. These Ss were reason-
ably well matched for education, age, and IQ.
The following are means and standard devia-
tions for each group: Paranoid Schizophre-
nics: 10.37 years of school (SD 2.68), 30.4
years of age (SD 5.34), and 105.11 IQ (SD
14.19). Unclassified Schizophrenics: [illegible]
years of school (SD 2.43), 28.97 years of age
(SD 6.20), and 101.77 IQ (SD 14.99). Neu-
rotics: 10.24 years of school (SD 2.36), [illegible]
years of age (SD 6.38), and 108.71 IQ (SD
12.29). Character Disorders: 9.97 years of
school (SD 2.34), 29.25 years of age (SD
6.42), and 106.58 IQ (SD 11.86).

It should be noted that the test diagno-
ses and the classification arrived at in the final
hospital diagnostic staff agreed perfectly. In
the instance of the other two classifications,
some intra-category variance was permitted;
for example, testing for a specific patient might
have suggested a diagnosis of character dis-
order, passive-aggressive reaction, while the
final staff diagnosis might have been character
disorder, passive-dependent type. In this in-
stance, however, there was agreement in the
classification of character disorder. The same
type of intra-category deviation was allowed
within the neurotic category.

For the judgmental approach, 120 cases
were randomly drawn from the larger sample
of 809 Ss. This sample was composed of 30
cases from each of the four diagnostic cate-
gories. These were closely matched for age
and IQ and approximately for educational
level and occupation.


Results


Objective Analysis


For the objective analysis, 28 factors were
analyzed. These were: Total number of re-
sponses, W, poor W, D, Dd, H, Hd, H plus
Hd, poor H plus Hd, sexual responses, P, S,
blends, rejections, content categories, color
responses, M, poor M, animal movement, in-
animate movement, sum of human, animal
and inanimate movement, Y, poor Y, F%,
F+%, An, Hd > H, and hostile responses
(characterized by violence, explosion, blood,
fire, and other destructive forces). Chi-square
analyses were computed between the four psy-
chiatric groups for each of the above factors.
Significant differences between diagnostic
groups were found in only nine of these vari-
ables and are presented in Table 1.

Attempts to further refine these factors
failed to be of value as they occurred too in-
frequently to be of diagnostic use. Frequency
distribution curves of the preceding distribu-
tions were plotted for each diagnostic cate-
gory and frequency cutoff points were se-
lected where such curves showed their great-
est divergence. Various groupings of these
were then applied to 100 randomly chosen
cases of the original sample, equally distrib-
uted among the four diagnostic categories, to
determine whether the diagnostic structure of
this sample could be predicted. However,
only 31% of cases were accurately diagnosed
by this approach, ranging from 28% for char- 
acter disorders to 34% for unclassified schiz- 
ophrenics. Use of determinant percentages 
failed to produce results differing from those 
given above. 
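
Because the 100 test cases were equally distributed among four categories, blind guessing would be expected to place about 25% of them correctly. A minimal sketch of how the reported 31% compares with that baseline follows. It assumes independent cases and an exact binomial model, which is an illustration rather than the analysis the authors report, and the function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(successes, trials, chance):
    """Probability of at least `successes` hits in `trials` independent guesses."""
    return sum(
        comb(trials, k) * chance ** k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# 31 correct placements out of 100, against a 1-in-4 chance rate.
p_value = binomial_upper_tail(31, 100, 0.25)
print(round(p_value, 3))
```

Under these assumptions the one-tailed probability stays above the .05 level, so 31 correct placements in 100 would not by themselves distinguish the cutoff method from guessing.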

Since it was possible that certain determi-
nant distributions on specific Rorschach plates
might be of diagnostic value, a card-by-card 
analysis was undertaken. Some of the previ- 
ously utilized Rorschach factors were omitted 
since they occurred too infrequently. Results 
indicated that F + %, P, An, Hd, H, and W 
characteristics of specific Rorschach cards 
were significant in differentiating various di- 
agnostic groups. Again, when cutoff points 
utilizing patternings of these card determi- 
nant relationships were applied to 100 ran- 
domly drawn cases from the original sample, 
placement into diagnostic categories failed to 
be significantly better than chance. 


Judgmental Analysis 


Thirty records from each of the four diag- 
nostic groups, a total of 120 Rorschachs, were 
drawn from the original sample of 809 cases. 
As previously indicated, psychograms and 
protocols, separated in half of the sample, 
were judged independently. In the remaining 
half, the psychograms and protocols were 
judged together. 


Table 1
Rorschach Factors Significantly Differentiating Between Diagnostic Groups

Diagnostic Groups                             Number C[?]  W      Hd     Hd>H   Sex    P      Hostile Resp.  F+%    An
Uncl. Schizophrenia vs. Par. Schizophrenia    NS*          .03    .05    NS     NS     [?]    NS             [?]    [?]
Uncl. Schizophrenia vs. [illegible]           NS           .02    NS     [?]    .01    NS     .01            [?]    NS
Uncl. Schizophrenia vs. [illegible]           .02          NS     .02    .01    NS     NS     NS             .001   NS
Par. Schizophrenia vs. [illegible]            .01          .02    NS     .03    NS     .01    .03            NS     .01
Par. Schizophrenia vs. [illegible]            .02          .02    NS     .01    NS     .02    NS             NS     .05
Neurotic vs. Character Disorder               NS           NS     NS     NS     NS     NS     NS             NS     NS
Over-all                                      .01          .01    .01    .01    .01    .01    .02            .01    .02

* Not significant.


Table 2
Correct Diagnostic Classifications of Judges Under Three Conditions of Judgment

                Paranoid               Unclassified
                Schizophrenia          Schizophrenia          Neurotic               Character Disorder     All Diagnoses
Judge           Prot.  Psych. Comb.    Prot.  Psych. Comb.    Prot.  Psych. Comb.    Prot.  Psych. Comb.    Prot.  Psych. Comb.
A               8      6      7        4      4      4        2      8      3        4      4      4        18     22     18
B               4      4      5        4      4      5        5      6      5        7      6      4        20     20     1[?]
C               7      3      4        5      4      6        4      5      7        1      1      4        17     13     22
D               5      [?]    5        6      6      7        3      6      4        5      6      6        18     21     [?]
E               4      5      7        4      2      3        3      6      5        6      1      6        17     14     [?]
All Judgments   28     21     28       23     20     25       16     31     24       23     18     24       90     90     101


For this aspect of the study, the following 
are considered: (a) Accuracy of judgments 
for the four diagnostic categories and again 
when combined into overall psychotic and 
nonpsychotic groups (these are compared 
under the conditions when judgments are 
based upon psychogram or protocol alone or 
a combination of the two); (b) judgmental 
bias; and (c) judgmental reliability. 

Accuracy of judgments. Data showing the
accuracy of judgment under the various con-
ditions for judgment are shown in Table 2
and the distribution of all judgments is to be
found in Table 3. Simple and multivariate
analyses of variance failed to disclose any sig-
nificant deviations from chance expectancy for
the number of correct judgments made by
the judges for any of the diagnostic
categories regardless of whether psychograms,
protocols, or combinations of the two were
used. Furthermore, no significant differences
were found to exist between judgments based on
either psychograms, protocols, or both. Al-
though missing the criteria of significance,
some indications were present that the psy-
chograms were somewhat better for the pre-
diction of the neurosis and that the protocols
permitted a somewhat more accurate judg-
ment of paranoid schizophrenia. When judg-
ments were considered for the prediction of
psychosis (paranoid and unclassified schizo-
phrenics combined) or nonpsychosis, analysis
of variance disclosed that these were signifi-
cantly better than chance (p < .05). No sig-
nificant differences were found between any
of the three conditions of prediction, although
more correct judgments were obtained when
protocols and psychograms were utilized in
combination.


Table 3
Relationship of Actual Diagnoses to Judged Diagnoses by Five Judges

                                                          Judged Diagnoses
                            Paranoid               Unclassified
                            Schizophrenia          Schizophrenia          Neurotic               Character Disorder
Actual Diagnoses            Psych. Prot.  Comb.    Psych. Prot.  Comb.    Psych. Prot.  Comb.    Psych. Prot.  Comb.
Paranoid Schizophrenia      21     28     28       29     [?]    [?]      [?]    [?]    [?]      [?]    [?]    [?]
Unclassified Schizophrenia  19     24     28       20     [?]    [?]      [?]    [?]    [?]      [?]    [?]    [?]
Neurotic                    15     19     [?]      15     [?]    [?]      [?]    [?]    24       [?]    [?]    [?]
Character Disorder          [?]    [?]    [?]      [?]    [?]    [?]      [?]    [?]    [?]      [?]    [?]    [?]
All Subjects                70     90     89       79     83     77       96     60     75       55     [?]    [?]


A possible factor entering into the differen-
tial judgmental situation is the similarity
or dissimilarity of the data to be judged at
any one time. If one assumes that Rorschachs
within a diagnostic group or within a psycho-
tic-nonpsychotic category have greater simi-
larity among themselves than to Rorschachs
of other diagnostic categories, then judgment
of a series of similar cases should be more
difficult than those of a series of dissimilar
cases. This difficulty would be accentuated if
the judges had the expectancy that represen-
tatives of any or all categories might be pres-
ent in the data to be evaluated. For this rea-
son, as previously pointed out, cases were as-
sembled so that all combinations of categories
were represented in the various blocks of data
to be judged. Some were thus composed of
both psychotic and nonpsychotic records while
others contained only psychotic or nonpsycho-
tic data. Analysis of variance, however, indi-
cated that regardless of the combination of
categories judged, no significant differences in
the number of correct diagnostic predictions
were present among the various combinations.

Judgmental bias. The possibility that the
type of setting in which a clinician works
might bias the kinds of diagnostic judgments
he might make was investigated. A compari-
son of placements within the four diagnostic
groups (see Table 3) by analysis of variance
showed no significant differences between
judges in utilization of a particular diagnosis.
Similar analysis when data were regrouped into
psychotic and nonpsychotic categories dis-
closed that in the instance of the combined
psychogram-protocol condition, a significantly
greater number of cases (p < .05) were clas-
sified by each judge as psychotic. Although
falling short of significance, a similar tend-
ency toward disproportionate judgments of
psychosis was found for the other two bases
of judgment.

Judgmental reliability. Reliability of judg-
ments is presented from the point of view of
the diagnostic correspondence of the judg-
ments made separately from psychograms and
protocols of the same patients and from over-
all interjudge diagnostic agreement. Chi-
square analysis of the 60 cases where proto-
cols and psychograms were independently
judged failed to disclose a significant corre-
spondence of judgment by judges, considered
individually or collectively, with respect to
specific diagnoses or the psychotic-nonpsy-
chotic dichotomy. Interjudge comparisons for
each judgment condition likewise failed to
show significant agreement among judges with
respect to any single diagnostic category.
Analysis, however, indicates that when only
the psychotic-nonpsychotic dichotomy is con-
sidered, agreements are significant at the p
< .05 level. If the criterion of consensus is
set as agreement of any three of the five
judges, both agreements with respect to spe-
cific diagnoses and the psychotic-nonpsycho-
tic category rose considerably, the former
being significant at the p < .05 and the lat-
ter at the p < .01 level. However, agreements
were yet far short of reliability levels neces-
sary for individual case predictions.
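
The three-of-five criterion can be illustrated with a small sketch. The function name and the example placements are hypothetical and are not taken from the study's records.

```python
from collections import Counter


def consensus(labels, required=3):
    """Return the diagnosis chosen by at least `required` judges, else None."""
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes >= required else None


# Hypothetical placements of one record by five judges.
placements = ["neurotic", "neurotic", "paranoid", "neurotic", "character disorder"]
print(consensus(placements))
```

With five judges and a threshold of three, at most one diagnosis can reach consensus, so a record either receives a single consensus placement or none at all.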


Discussion 


The results of this study are disappointing 
from the standpoint of diagnostic predictions 
when Rorschach data is employed in the ab- 
sence of additional test materials. Findings 
are quite definite in that Rorschach data in 
isolation do not offer a sufficiently broad judg- 
mental base for consistent individual predic- 
tion, either within specific diagnostic catego- 
ries or within the psychotic—nonpsychotic di- 
chotomy. 

In recent years, increasing emphasis has 
been put on the importance of content in the 
Rorschach. One might therefore have assumed 
that Rorschach protocols which furnish such 
data might enhance predictability, an assump- 
tion not supported by the results of this 
study. But perhaps this study is an unfair test 
of the usefulness of the Rorschach in that it 
does not duplicate the typical diagnostic pre- 
diction situation. In the typical situation, the 
evaluator may have available other psycho-
logical test data and cues from interpersonal
contact with the patient and his reaction to 
the testing situation which may serve to am- 
plify or modify conclusions from the Ror- 
schach data. 

The study, however, indicates the lack of 
consistency among judges in the diagnostic 
interpretation of identical data. If the inves- 
tigation had concerned itself only with pre- 
dictions in which judges expressed confidence, 
consistency might have been greater.

Similar negative findings from the objective 
analysis perhaps were to be expected, since 
numerous objections may be raised to the 
validity of any procedure which would com-
pile determinants to arrive at a diagnosis.
Some significant relationships of determinants
to diagnosis were found, but such significance
was largely due to the size of the sample em-
ployed and was not of value for making in-
dividual predictions in this study.


Summary


The consistency with which individual or 
group diagnostic categorization can be pre- 
dicted from the Rorschach was investigated 
in two ways; one was an objective statistical 
approach and the other a subjective judg- 
mental approach. In the first, an attempt was 
made to relate statistically either single or 
patterned Rorschach determinants to previ-
ously made diagnostic judgments. The results
failed to uncover any useful means of arriving
at a diagnosis. The judgmental approach was 
found to be equally unsuccessful in achieving 
consistent diagnostic predictions.


Those concerned with the problems
of prediction of the future performance of
individuals have, in recent years, become in-
creasingly explicit in acknowledging the ne-
cessity of taking personality factors into ac-
count in predicting future achievement (9, 11,
14). Along with this growing interest in the
problem of the influence of personality vari-
ables there has been an increase in empirical
studies of the relationships between scores on
paper and pencil personality tests and meas-
ures of achievement such as grade point av-
erage (6, 9, 11, 12).
The personality characteristic most
studied in recent years has been anxiety (as
defined by score on tests like the Taylor Anx-
iety Scale). Since this variable has been shown
to affect performance in numerous labora-
tory situations (1, 4, 7, 8, 10, 13), it is rea-
sonable that there would be interest in ex-
tending our knowledge of its effects to “real
life” behavior such as academic achievement.
With respect to the Taylor scale, the results
to date have been disappointing. The most
reliable studies reported in the literature indi-
cate that level of anxiety has no demonstrable
effect on academic achievement (9, 12). This
is a rather surprising result in view of the fre-
quent clinical observation that high anxiety
leads to a breakdown or decline in achieve-
ment.

In view of this apparent failure of test-
defined anxiety scores to predict achievement,
it would seem fruitful to re-examine our po-
sition as to what anxiety scales are measuring.
In this regard, it seems that those of us who
have worked with tests like the Taylor scale
have ignored one very important observation
made by clinicians and laymen. That is, that
people are not anxious every minute of the
day and that often we can specify the condi-
tions which will lead to an increase in anxiety
in the individual. Perhaps what we need are
not general anxiety scales oriented towards
the kinds of anxiety responses (e.g., sweat-
ing, awareness of an increase in tension, etc.)
which an individual will admit to but, rather,
tests designed to assess the specific condi-
tions under which anxiety is aroused, or, of
course, perhaps we need some combination of
both.

1 The writer is indebted to Seymour B. Sarason
for his aid in securing the data on which
this study is based.

The present study was designed to evaluate 
the role of anxiety in academic achievement 
when anxiety is defined as a general charac- 
teristic and also as one specific to a particular 
situation. Scores on two questionnaires de- 
vised by S. Sarason (3, 5), a general anxiety 
scale and a test anxiety scale, were used to 
select extreme anxiety groups. College en- 
trance examination scores and grade point av- 
erages were used as response measures. It was 
expected that the test anxiety scale might be 
of more predictive utility in this situation 
than the general anxiety scale because of the
test anxiety scale’s more intensive assessment
of anxiety responses and their antecedents in
one situation, the testing situation. Evi-
dence presented by Sarason and Mandler (1)
and Gordon (2) seems to indicate that [illegible]
a direct comparison of the two kinds [illegible]
has as yet not been attempted. In the present
study, it was possible to make this kind of
comparison.
A word should be said at this point of the 
way in which we shall use the term anxiety 
and the kinds of predictions this definition 
leads to. Our knowledge of the role of anxiety 
on behavior is at present quite meager, even 
in relatively simple experimental situations, 
There is, however, a growing body of evidence 
which can be interpreted as suggesting that 
individuals obtaining high scores on anxiety 
questionnaires differ from other subjects (Ss) 
in the extent to which their performance is 
disrupted under conditions of stress (4, 5, 8, 
10, 11). It might further be inferred that an 
individual’s performance is disrupted to the 
extent that he brings to novel situations an- 
ticipations of failure, rejection, and inability 
to cope with the requirements of the situa- 
tion. If this were so, high anxious Ss would 
be regarded as emitting these interfering task-
irrelevant responses (e.g., self-verbalizing, “I
can’t pass this test”) to a greater extent than
do other Ss in the anxiety score distribution. 
In the case where certain Ss admit to these 
interfering responses we would expect a lower 
level of performance for them than for other 
Ss in the score distribution. While the rela-
tionships existing between general anxiety
(i.e., anxiety experienced in a wide variety of 
situations) and test anxiety (anxiety specif- 
ically in testing situations) are as yet far 
from clear, we would expect that Ss with high 
test anxiety scores would do relatively more 
poorly than low test anxious Ss in their per- 
formance on important entrance examinations 
for which they could not prepare. Unless gen- 
eral and test anxiety were very highly cor- 
related, we would not necessarily expect in- 
dividuals admitting to anxiety in a variety of 
situations to be anxious in a testing situation. 
In any event, we would expect smaller differ- 
ences on entrance examinations for extreme 


general than for extreme test anxiety groups. 


A further prediction was made with respect 
to the grade-point averages of extreme test 
anxiety groups. On the assumption that high
anxiety leads to performance decrements in
novel situations for which preparation is not
possible, it was expected that high and low
test anxious groups would not differ in grade
point average. This kind of result was ex-
pected on the basis that in a particular class-
room situation the deleterious effects of the
novelty of test situations would decrease by
virtue of the increase in familiarity with the
teacher, classroom, etc., over a period of a
semester. Another possibility leading to the
same prediction is that the anxiety of high test
anxious Ss may not be reduced during a course
but that they may have sufficient time over
the period of a semester to overlearn course
material.


Method 


The Ss were 305 Yale University liberal
arts undergraduate students. Most of the
Ss were administered the Test Anxiety (TA)
questionnaire and the General Anxiety (GA)
questionnaire in the fall of 1953. In some
cases, only TA scores were obtained. In all
cases, the anxiety questionnaires were
administered during introductory psychology
class meetings. At the time of taking these
questionnaires, the great majority of Ss were
sophomores or juniors. It is important to keep
this fact in mind in interpreting the present
results. Although the anxiety … there is no way
of determining whether or not the same results
as those presented here would be obtained
if the Ss used were largely freshmen
or seniors.

In the summer of 1956, as many of the following
measures as were available were recorded
for each S: (a) TA score, (b) GA
score, (c) Scholastic Aptitude Test (SAT)
scores (this test is largely verbal in nature),²
(d) Mathematical Aptitude Test (MAT)
scores,² and (e) yearly course grade point
averages.


Available for the 305 Ss were TA, GA, SAT,
and MAT scores. However, grades for the four
years of college were available for only 227 of
these Ss.³ Consequently, the Ns of the statistical
tests performed on the five measures
listed above varied.


2 SAT and MAT are examinations given to
Yale undergraduates upon entrance to the University.

3 Clearly, then, the results concerning grades are
generalizable only to those students who complete
four years of college.
Results and Discussion


In order to determine whether or not the anxiety
measures could be of utility in predicting
academic performance, correlations (Pearson
r’s) were obtained between the academic measures
and the GA and TA scores. Table 1 presents
the results of these analyses. Significant
negative correlations were obtained between
TA and the two entrance examinations (−.14
and −.20). However, since it appeared that
TA might be curvilinearly related to SAT and
MAT, epsilons were computed between TA
and SAT and MAT and tests made for curvilinearity.
In the case of TA and MAT, the results
of the statistical tests revealed that the
hypothesis that TA and MAT are rectilinearly
related could not be rejected. However, the
departure from rectilinearity was significant
(p < .01) for the TA-SAT relationship. The
value of epsilon between TA and SAT was
found to be .26. It thus appears that high TA
scores do seem to be related to relatively poor
performance on achievement tests of the type
used in the study. On the other hand, the
correlations presented in Table 1 between GA
and the two entrance examinations were not
significant. Therefore, with respect to the
SAT and MAT measures, the predictions concerning
the performance of Ss varying in TA
and GA scores were largely supported.

It had been expected that Ss with high TA
scores would do relatively poorly on SAT and
MAT but that this relative inferiority would
disappear in course work as a function of
either increased familiarity with or overlearning
of subject matter on which they would
be tested during the course. The results of


Table 1 


Correlations Between TA and GA Scores and 
SAT, MAT, and GPA* 


Anxiety                          GPA
Scale     SAT    MAT    Yr. 1   Yr. 2   Yr. 3   Yr. 4
TA       −.14   −.20   −.14*   −.17*   −.06    −.003
Ga cs a ee Oe 14" 


Pearson r’s involving SAT and MAT have N = 305.
Pearson r’s involving GPAs have N = 227.

* p < .05.
** p < .01.
Table 1 support this hypothesis, but it seems
to take high TA Ss longer to “catch up” with
other Ss than had been expected. For the first
two years of college, significant negative r’s
were obtained between TA and grade point 
averages. For the last two years these r’s fail 
to reach significance. 
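
Whether an r of this size reaches significance depends heavily on sample size. As a rough check, a Pearson r can be converted to a t statistic with N − 2 degrees of freedom, t = r·√(N − 2)/√(1 − r²). The sketch below applies this standard formula to the Year 1 and Year 3 TA-GPA values given in Table 1 (N = 227). The function name and the approximate critical value of 1.97 are assumptions introduced for illustration; they are not part of the original analysis.

```python
import math

def t_from_r(r: float, n: int) -> float:
    """Convert a Pearson r based on n pairs into a t statistic (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# Approximate two-tailed .05 critical value of t for df near 225 (assumed).
T_CRIT_05 = 1.97

# TA-GPA correlations for Year 1 and Year 3 as given in Table 1 (N = 227).
for label, r in [("Yr. 1", -0.14), ("Yr. 3", -0.06)]:
    t = t_from_r(r, 227)
    print(label, round(t, 2), abs(t) > T_CRIT_05)
```

Under these assumptions the Year 1 value exceeds the critical value and the Year 3 value does not, which agrees with the pattern described in the text.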

A quite unexpected finding concerned the 
r’s between GA and grade point averages. It 
had been predicted that GA scores would be 
unrelated to grade point averages. This pre- 
diction was made on the basis that knowing 
that an individual is anxious in a variety of 
situations does not necessarily mean that he 
will be anxious in a testing situation. The high 
r of .55 obtained between GA and TA might 
provide some grounds for expecting a tend- 
ency for the results with respect to GA to be 
similar to those for TA. On the contrary, 
however, there seems to be a tendency for 
high GA scores to be associated with high 
grade point averages. This result gives strik- 
ing support to the contention that in dis- 
cussing the effects of anxiety on performance, 
it is necessary to be clear as to the situations 
in which Ss admit to experiencing anxiety. 

One aspect of the results summarized in 
Table 1 which should be kept in mind is that 
all of the r’s except that between TA and GA
are quite low. It is unlikely that r’s of such 
low order, even though they are significant, 
can be used for the purposes of prediction 
of the academic performance of individuals. 
However, such results can be of theoretical 
utility. Because of the encouraging results pre- 
sented in Table 1, it seemed of interest to 
further test the hypotheses made with respect 
to the effects of TA and GA on academic per- 
formance using extreme anxiety groups since 
our predictions stemmed from certain expec- 
tations concerning the performance of high 
anxious Ss. Consequently, the performances on 
SAT and MAT of two extreme groups of high 
and low TA Ss were compared. Table 2 sum- 
marizes these results. When the upper 6% 
(H1) of the 305 Ss in the TA score distribu-
tion was compared with the lower 7% (L1)
of the TA distribution, the low TA group was
significantly superior (p < .01) to the high
TA group on both SAT and MAT. When Ss 
scoring on TA between the 8th and 17th per- 
Table 2


Means and SDs for Two High and Two Low Test
Anxiety Groups on SAT and MAT


TA Group               SAT               MAT
and Score     N      M       SD       M       SD
L1   0-7     22   611.41   66.81   653.21   70.21
L2   8-10    30   606.05   71.12   624.34   79.63
H2  26-28    38   588.94   74.63   602.31   84.73
H1  29-35    18   543.92   63.75   555.03   68.06


centiles (L2) were compared with Ss scoring 
between the 82nd and 93rd percentiles (H2), 
no significant differences were obtained. It 
thus appears that the TA questionnaire may 
be sufficiently sensitive only in selecting from 
a large group of individuals a small subgroup 
of Ss who perform in the manner predicted in 
this study. It is interesting that this result is 
quite consistent with the findings in other 
studies that anxiety questionnaires divide Ss 
into two groups: a small high anxious group 
and a group composed of Ss in the rest of the 
anxiety score distribution. In this regard, it is 

Fig. 1. GPA as a function of years in college for four
TA groups.

relevant to note that the two low TA groups
(L1 and L2) did not differ significantly on 
either of the two entrance examination meas- 
ures, but that the H1 TA group was signifi- 
cantly inferior (p < .05) to the H2 TA group, 
and that this latter group did not differ sig- 
nificantly from the low TA groups. When 
groups of high and low scores on the GA
questionnaire were compared with respect to
SAT and MAT performance no significant 
differences were obtained. 

In view of the correlations between TA
scores and grade point averages presented in
Table 1, one might expect, for the first two
years, to find a superiority of low to high TA
groups followed by no difference in the third
and fourth years. Fig. 1 presents the curves
of the H1, H2, L1, and L2 TA groups. These
curves represent grade point averages as a
function of years in college for Ss scoring
above the 91st percentile (H1), between the
82nd and 91st percentiles (H2), below the 7th
percentile (L1), and between the 9th and
18th percentiles (L2) for the 227 Ss on whom
grade point averages were available for four
years. There are three observations to be made
concerning Fig. 1. One aspect of it which is
striking is the clearcut superiority of the L1
group over the L2 as well as over the two
high TA groups. This superiority of the L1
TA group was found to be statistically
significant.

Another interesting finding represented in
Fig. 1 is the general similarity in the curve of
the L2 group to that of the H2 group. It had
been expected that the LA and HA groups
would not differ significantly when grade point
average was used as the measure of academic
achievement. Thus, to the extent that the H2
group can be considered a high anxious group,
the similarity in the L2 curve to the H1 and
H2 curves is as was expected. Clearly, however,
the superiority of the L1 group to the
other three groups was not expected. At this
point one can only conjecture as to the meaning
of the superiority of this group. It is possible
that a test-taking attitude (e.g., that
tapped by the MMPI K scale) peculiar to
L1 Ss may have contributed to their superiority
in grade point average. No measure of
test-taking attitude was available for the Ss
used in this study. It would seem to be of
interest to take this variable into account in
future research.

In addition to the unexpected superiority 
of the L1 group to all other groups whose per- 
formance is depicted in Fig. 1 is the marked 
difference in form of the H1 group curve from 
the other three. Most striking about this
group’s performance is the great improvement
in their performance in the second year fol-
lowing an extremely poor first year showing.
It had been expected that high anxious Ss
would perform as well as low anxious Ss in
course work on the assumption that during a
semester the high anxious individual can in-
crease his familiarity with the task, the in-
structor, etc., as well as possibly overlearn
course material. The curves in Fig. 1 suggest
that the process of acclimatization for high
anxious Ss may be considerably slower than
we had expected.

In order to follow up the unexpected significant
positive correlations obtained between
GA scores and grade point averages, two extreme
high and two extreme low GA anxiety
groups were compared in terms of the four
yearly grade point averages obtained. Fig. 2
presents the curves for the groups of Ss scoring
in the upper 8% of the GA score distribution
(H1), the lower 8% (L1), Ss scoring
in the 9th to 18th percentile range (L2),
and Ss in the 82nd to 91st percentile range
(H2) of the GA score distribution. A large,
highly significant, superiority of high to low
GA Ss is clearly in evidence throughout the
four year period. This result is not in accord
with the predictions made prior to carrying
out the study and, together with the surprising
superiority of the L1 TA group to the
other TA groups studied, poses important
problems for future research. At present, it is
unclear what individual difference variables
are contributing to the superiority in achievement
of both individuals who admit to virtually
no anxiety in test situations and individuals
who admit to great anxiety in a wide
variety of situations.


It is clear, however, that it is important in
discussing the effects of anxiety on performance
to specify the manner in which anxiety
is measured (i.e., by means of which instrument).
The results of the present study reveal
the importance of specifying the specific
situations in which an individual experiences
anxiety if one is interested in predicting
his future performance in specific situations.


Summary 


1. The relationships of anxiety as meas- 
ured by the Test Anxiety (TA) and General 
Anxiety (GA) questionnaires to entrance ex- 
aminations and grade point averages were 
studied. 

2. TA scores tended to correlate negatively 
with measures of academic achievement, al- 
though with increase in number of years in 
college the negative correlation disappeared. 
GA scores failed to correlate significantly with 
entrance examination scores, but tended to 
correlate positively with grade point averages. 

3. In studying extreme anxiety groups, high 
TA Ss performed at a significantly lower level 
than did low TA Ss. In addition, significant 
differences were found within each of the low 
and high test anxious groups. Thus, on en- 
trance examinations the most extreme test 
anxiety group performed at a significantly 
lower level than did a group of high anxious 
Ss with less extreme TA scores, and for course 
grades, the Ss with the lowest TA scores per-
formed at a considerably higher level than
did low anxious Ss with less extreme scores. 

4. The results demonstrated that relation- 
ships between anxiety and achievement vari- 
ables depend to an important extent on the 
nature of the instrument employed to measure 
anxiety. 


Received February 8, 1957. 
Generalization as a Function of Manifest Anxiety
and Adaptation to Psychological Experiments ¹


Sarnoff A. Mednick ²


Harvard University


By attributing drive properties to anxiety,
as measured by the Taylor Manifest Anxiety
Scale (MAS), this variable has been incorporated
into the Hullian framework (8). In this
context, defense conditioning investigations
have indicated that individuals scoring high
on the MAS condition faster than low scorers
(5, 6, 7, 9). However, a study which utilized
classical reward training of the salivary response,
a situation relatively void of noxious
stimulation, revealed no differences between
low anxious (LA) and high anxious (HA)
Ss (1). Work by Rosenbaum and Wenar
on stimulus generalization (SG) suggests that
under strong stress conditions the HA group
generalizes more than the LA group (4, 13).
However, when Rosenbaum used only mild
shock or buzzer, he failed to find a difference
in generalization between the anxiety groups.
These findings suggest that the HA scorer on
the MAS does not chronically carry his anxiety
around with him, but that it must be specifically
elicited by some stress situation. This
interpretation has been noted earlier (7, 12).
The present study was undertaken to test for
differences in SG as a function of scores on
the Heineman forced-choice form of the MAS
in a situation containing no stress deliberately
introduced by E.

Method


Apparatus

The SG apparatus was adapted from one
devised and described in detail by Brown
et al. (2). It consists of a plywood panel
(6 ft. × 2 ft.), painted flat black, upon which
is mounted a horizontal row of eleven lamps
(115 v., 7.5 w.) spaced 9 degrees of visual
angle apart. The lamps are designated 1 to 11
with Lamp 1 being on S’s left and Lamp 6
being the center lamp. The panel is curved so
that all lamps are equidistant from S’s eyes
when he is seated directly in front of Lamp 6,
3.5 feet away. A red-jeweled pilot lamp, 2
inches above Lamp 6, serves as a fixation point
and a ready signal. A reaction key was placed
on S’s preferred side and he was allowed to
move it into a comfortable position. Response
latency was measured to the nearest 1/100th
of a second with a Standard Electric Timer.
The stimulus board effectively hid both E and
the equipment from S.


1 This study was supported in part by a grant
from the Laboratory of Social Relations, Harvard
University.

2 Part of this study was completed while the
author was on a USPHS postdoctoral … Northwestern
University, College … University of Illinois,
University of Chicago and Michael Reese Hospital.

Procedure 

Ss were sixty undergraduate volunteers from 
Northwestern University psychology classes to which 
the Heineman forced-choice form of the MAS had 
been administered and scored by Key 2 (3). On the 
basis of their test scores, high anxious (HA), me- 
dium anxious (MA), and low anxious (LA) groups 
of 20 Ss each (evenly divided as to sex) were estab- 
lished. The median MAS scores for the HA, MA, 
and LA groups, respectively, were 70, 57, and 44. 
The median score of the original Heineman sample 
was 54. 

The instructions informed S that E was in-
terested in how fast they were capable of re-
acting. It was explained that they were to
react by lifting their finger from a reaction 
key when Lamp 6 was lit and to continue 
holding down the key when any of the pe- 
ripheral lamps were lit. Speed was stressed 
again and Ss were urged not to worry about 
accidental responses to the peripheral lamps 
but to simply proceed to the next trial. There 
followed 20 training trials to Lamp 6 with a 
10- to 20-sec. intertrial interval. The foreperiod
between ready signal and stimulus was
varied between 2 to 5 sec. Without any warning
the 20 training trials were followed by a
generalization test series consisting of four
presentations of each of the 10 …
…ing arrangements with … the MA group was
tested in the first and …rth weeks of the quarter
(ten Ss each time) and the HA and the LA groups
were tested in the sixth and ninth weeks of the
ten-week quarter (ten Ss from each group
each time).


Table 1
Median Number of SG Responses

                     Anxiety Groups
Test                                      All
Period         High   Medium    Low     Cases
Tested early    6.5     9.0     5.5      6.5
Tested late     4.5     6.5     4.5      5.5
All cases       5.5     8.5     5.0      6.0


Results 


Table 1 compares the … SG responses …
groups to differ significantly …


Table 2
Mean Number of SG Lamps Eliciting Responses

                     Anxiety Groups
Test                                      All
Period         High   Medium    Low     Cases
Tested early    5.3     7.0     4.3     5.53
Tested late     4.0     4.9     4.6     4.50
All cases       4.65    5.95    4.45    5.02




groups by simply counting for each individual
the number of SG lamps eliciting responses
during the 40 test trials (SG score). This
score can only range from 0 to 10 and thus
has the advantage of reducing the weight
given to the aberrant individuals referred to
above. These SG scores were submitted to an
analysis of variance (Table 3) to test the sig-
nificance of the differences between the anxi-
ety groups. As can be seen from Table 3, the
groups differed significantly. However, inspec-
tion of Tables 1 and 2 makes clear that the
significance of the difference is mainly due to
the MA group demonstrating considerably
more SG than the other two groups. The HA
and LA groups apparently generalize at about
the same level.
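
Each F in Table 3 is the ratio of an effect mean square to the within-groups mean square. The following sketch recomputes the three F ratios from the mean squares printed in Table 3; the variable and function names are introduced here for illustration only. Because the printed mean squares are themselves rounded, the recomputed values can be expected to agree with the tabled F's only to within about .01.

```python
# Mean squares and degrees of freedom as printed in Table 3.
MS_WITHIN = 3.84
effects = {
    "Time of test": (1, 16.01),
    "Anxiety groups": (2, 13.27),
    "Time x Anxiety": (2, 7.47),
}

def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F is the effect mean square divided by the error (within) mean square."""
    return ms_effect / ms_error

# Recompute each F, rounded to two decimals as in the table.
f_values = {
    name: round(f_ratio(ms, MS_WITHIN), 2) for name, (df, ms) in effects.items()
}
for name, f in f_values.items():
    print(name, f)
```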

In view of the experimental procedure,
these results could either be attributed to the
MA group having more drive than the LA and
HA groups or to the fact that they were tested
earlier in the quarter. At Northwestern University,
students in elementary psychology
courses are required to serve for five experiment
hours and encouraged to serve up to
… For the half of the MA group tested
earlier, this study was the first they had experienced,
while the half of the MA group
tested later had already served in two …
studies. For the HA and LA groups tested
earlier this study was either the fifth, sixth,
or seventh, while for the half of these groups
tested later this study was either the … or
tenth they had experienced. (The HA and LA
groups were so heavily tested that quarter that
… Ss voluntarily participated in up to fo…
studies.)


Table 3
Analysis of Variance of Stimulus Generalization

Source               df      MS       F
Time of test          1    16.01    4.17*
Anxiety groups        2    13.27    3.45*
Interaction:
  Time × Anxiety      2     7.47    1.94
Within               54     3.84

* p < .05.


In order to evaluate the factor of
“time-of-quarter-at-which-SG-tests-took-place,”
each group was split into the half tested earlier
(first week for MA group, sixth week for HA
and LA groups) and the half tested later
(third week for the MA group, ninth week for
the HA and LA groups) in the quarter. The
median number of SG responses for the early
and late subgroups of the HA, MA, and LA
groups are presented in Table 1. The early
groups are consistently more responsive. The
mean SG scores for the early and late subgroups
of the HA, MA, and LA groups are
presented in Table 2; the analysis of variance
of the SG score data is presented in Table 3.
The significance of the early-late variable suggests
that time of testing in quarter influences
amount of SG responsivity. The Anxiety
Groups × Time of Testing interaction is not
significant.


Table 4
Percentage of Group Responding at Each Test Lamp
Test Lamps
EA i ¢ @ e g & f ao Ai
- Si $0 60 50 40 30
50 60 S0 10 so 6 ;
ETA oD 30 50 50 60 100 60 50 30 ao y
an 30 35 50 55 70 100 70 55 40 2
é J »
i 70 0 50 80 50
8 80 80 80 100 7 7 5 8
FA T 20 50 50 70 100 80 60 50 s0 20
al 4s 55 6 6 75 100 75 6 D as 5
i 50 50 20 20
OF 20 50 80 100 80 5 5
ed A > 30 70 90 100 80 50 30 30 30
Al 25 30 25 60 85 100 8 50 45 20 35


Table 4 presents the data in terms of the
percentage of the groups that responded at
each test lamp. As can be seen, the early subgroups
are consistently more responsive than
the late subgroups. The early HA Ss respond
more than the early LA Ss.

In view of the time of testing discrepancy,
the results of the MA group are difficult to
interpret. However, the HA and LA groups
were tested at the same times. A Mann-Whitney
U Test comparing early HA with
early LA SG scores shows the early HA group
demonstrating significantly more SG responsivity
(U = 22.5, p = .045, one-tail test).
The late HA group and late LA group did not
differ significantly (U = 41.5, n.s.).
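
For readers unfamiliar with the statistic, the Mann-Whitney U for two small groups can be obtained by counting, over all pairs of scores, how often a score in one group exceeds a score in the other, with ties counted as one half. The sketch below illustrates that counting rule. The scores are hypothetical and are NOT the study's data, and the function name is likewise introduced only for illustration.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y): for each sample, the number of cross-sample pairs
    in which its score is the larger one; tied pairs count 0.5 for each."""
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y

# Hypothetical SG scores (range 0-10) for two groups of ten Ss; NOT study data.
early_ha = [7, 6, 8, 5, 9, 6, 4, 7, 5, 6]
early_la = [4, 5, 3, 6, 4, 2, 5, 7, 3, 4]

u_ha, u_la = mann_whitney_u(early_ha, early_la)
u = min(u_ha, u_la)  # the smaller U is the value referred to published tables
print(u_ha, u_la, u)
```

The two U values always sum to the product of the group sizes (here 10 × 10 = 100), which gives a quick arithmetic check on the count.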

In summary, the results indicate that the
HA and MA Ss that were tested early and had
taken part in fewer previous studies (naive
Ss) generalize more than the HA
and MA Ss who were tested later and had
taken part in many studies (sophisticated Ss).
The LA group showed … sophisticated and naive Ss.
… naive HA Ss generalized significantly more
than the naive LA group … sophisticated Ss this
difference … anxiety may be elicited by “…


… situational anxiety adapted out, leaving
… the late HA and the … LA Ss. A similar adapting
of the … has been observed informally by Taylor
… and Spence (8). The adapting-out hypothesis
would, of course, also explain the differences
in SG between all the early Ss and …
… that some variables other than
number of psychological experiments covary
with “early” and “late” experimental sessions.
For example, one of these could be the transmission
of information between Ss. However,
this is not seen as a likely occurrence. None of
the “late” Ss expressed …


… arousal, has important implications for research
… that use of experimentally sophisticated individuals
as Ss will … …ences between the HA and LA …
…tion of this variable may cast s… otherwise
inexplicable … Because of the … original purpose
of the study, … function of MAS scores
has been put aside. The results seem to indicate
that HA Ss, whose low th… …um’s re…
SG as a function of anxiety level (4). The
studies disagree in … the differences …

Summary and Conclusions

Groups of high, medium, and low anxious
(HA, MA, and LA) Ss as measured by the
Heineman forced-choice form of the Taylor
Manifest Anxiety Scale (MAS) were tested
for stimulus generalization (SG). An unexpected
finding (the MA Ss showed more SG
than HA and LA Ss) led to a …
of the data. The results of this tended to support
an interpretation which sees a high MAS
score as indicating a low threshold for anxiety
elicitation by a specific stress stimulus as opposed
to a chronic state. The results suggest
that this low threshold …
repeated experience in the situation.

A comparison of the HA and LA Ss that
were relatively experimentally naive indicates
that the HA group shows more SG than the
LA group.
Self Concepts in Adjusted and Maladjusted
Hospital Patients ¹ ²


Philip H. Chase ³


University of Colorado 


Difficulties in the development of satisfac-
tory measures of adjustment level character-
ize part of the criterion problem faced by re-
searchers investigating therapeutic processes.
This paper is concerned with one aspect of the
utility of the Q technique as a measure of ad-
justment.

The comparison of self and ideal sorts has
received much recent attention. Typically, the
correlation between S’s Q sorts for his con-
cepts of “self” and “ideal self” is used as an
index of adjustment level. A low or negative
correlation is assumed to indicate an unsatis-
factory level of adjustment, and changes in
the magnitude of correlation from before to
after therapy are thus assumed to reflect
changes in level of adjustment.

Dymond (2, p. 84) has pointed out that
the procedure may be invalid due to the con-
tamination of posttherapy sorts by the thera-
pist’s expressions of satisfaction with the pa-
tient’s progress. Contamination of the thera-
pist’s ratings of success in treatment, often
used as criteria, may also occur as a function
of invalid “well-adjusted” self-references on
the part of the patient.


1 Adapted from a dissertation submitted in partial 
fulfillment of the requirements for the degree of Doc- 
tor of Philosophy, University of Colorado, 1956. The 
author wishes to thank Dr. Victor Raimy, Dr. 
Dorothy Martin, Dr. William Scott, and Dr. Michael 
Wertheimer, members of his committee, for their 
help, and also Dr. Howard Siple and Dr. Lewis 
Bernstein, Denver VA Hospital, for their encourage-
ment and assistance in making possible the collec-
tion of the data.

2 Published with permission of the Chief Medical
Director, Department of Medicine and Surgery, Vet-
erans Administration, who assumes no responsibility
for the opinions expressed or conclusions drawn by
the author.

3 Now at the Veterans Administration Hospital, St.
Cloud, Minnesota.


The need for study of the relation between 
self and ideal self, independent of the psycho- 
therapeutic situation, was obvious and led to 
the inception of the present research. 


Method 


The Ss were male, hospitalized veterans 
with at least an eighth-grade education. No 
Ss were included in whom central nervous 
system damage was considered as a possible 
diagnosis. All Ss were selected, according to 
their availability, within two weeks following 
their admission to the hospital. 

The “maladjusted” group consisted of three 
subgroups: 19 psychotics, 20 neurotics, and 
17 patients with character or personality dis- 
orders. The “adjusted” group consisted of 50 
patients without evidence of psychiatric diffi- 
culties who were hospitalized on medical or 
surgical wards. The “adjusted” group was di- 
vided into random halves. 

The “adjusted” and “maladjusted” groups 
did not differ significantly in mean age or edu- 
cation, nor was there a significant difference 
in marital status. 

All Ss were administered the 50 self-referring 
items in Hilden’s set number 13 (4) with instruc- 
tions to sort the items for their concepts of self, ideal 
self, and average other person. With the exception 
of minor changes, the instructions and sorting pro-
cedures were those recommended by Hilden and in-
volved a seminormalized, nine-category distribution.
Intersort correlations for each S were determined by
Hilden’s method. Mean correlations for each group
were determined through transformation to Fisher’s
z scores.
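
Averaging correlations through Fisher's z means converting each r to z = arctanh(r), taking the mean of the z values, and converting that mean back with tanh. A minimal sketch of this standard procedure follows. The function name and the example correlations are hypothetical and are not values from the present study, and an unweighted mean is assumed since the text does not mention weighting.

```python
import math

def mean_correlation(rs):
    """Average correlations by way of Fisher's z:
    z = atanh(r) for each r, take the mean z, then transform back with tanh."""
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))

# Hypothetical self-ideal correlations for five Ss; NOT values from the study.
example_rs = [0.72, 0.55, 0.10, -0.20, 0.64]
print(round(mean_correlation(example_rs), 3))
```

Because the transformation stretches values near ±1, a mean obtained this way generally differs somewhat from the simple arithmetic mean of the r's.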

Three basic adjustment measures were de-
rived from correlations between: sorts for
concepts of self and ideal self (S-I), sorts for
concepts of self and of the average other per-
son (S-AO), and sorts for concepts of ideal
self and of the average other person (I-AO).

The self- and average-other-person sorts of
one-half of the “adjusted” Ss were each aver-
aged to yield mean “normal” sorts for both
concepts. Three additional measures were thus
available from the correlations between: sorts
for the concept of self and the “normative”
self-sort (S-NS), sorts for the concept of self
and the “normative” average-other-person sort
(S-NAO), and sorts for the concept of the
average other person and the “normative”
average-other-person sort (AO-NAO). Mean
correlations for each group were determined
as for the first three measures.


justed” patients wh 
velopment of the « 


an correlati 
by the neurotics, and that t 


i e. Since 
tients were receiving fi 


ormal 
therapy at th, mes 


Results


The first hypothesis was confirmed. The S-I,
S-AO, S-NS, and S-NAO mean correlations of
all three “maladjusted” groups, taken singly or
together, were significantly lower than those of
the “adjusted” group at the .01 level. No significant
differences were observed for the I-AO and
AO-NAO measures.

The data, although suggestive of a trend in
the expected direction, failed to confirm the
… …een the sup… S-NAO measures, but did
not reach significance at even the .05 level.

For the four discriminating measures, only
Ss in the “maladjusted” group yielded individual
correlations of zero or negative magnitude.
While the “adjusted” Ss saw themselves
as being different in varying degrees from
their ideal or from others, only Ss with psychiatric
difficulties ever saw themselves, not
only as being different from the ideal or from
the average other person, but in some cases
as tending toward the opposite.

Both “adjusted” and “maladjusted” Ss
tended to have similar conceptions of the
ideal self and of the average other person.


Table 1


Mean Correlations for All Measures


                      Total     Char.
           Adjusted   maladj.   dis.      Neurotic   Psychotic
Measure    (N = 25)   (N = 56)  (N = 17)  (N = 20)   (N = 19)
S-I          .642       .362      .403      .386       .296
S-AO         .560       .334      .387      .302       .295
I-AO         .593       .566      .595      .590       .510
S-NS         .656       .434      .446      .427       .421
S-NAO        .618       .352      .390      .343       .327
AO-NAO       .648       .504      .643      .572       .556

Discussion 


Several limitations of the study should be
noted. Firstly, the sampling cannot be assumed
to be random since Ss were selected
… being available for …
their initial two weeks …
Secondly, since all Ss were hospitalized,
it is not known if similar …
Ss might perform as those in the
present study did. Thirdly, it is not …
that the use of the z statistic can be defended
… independence of elements may not be
assumed, but it seemed to be the most appropriate
statistical method to choose for an
analysis of the data.

Nevertheless, the results suggest that psychiatrically
maladjusted groups may be distinguished
from adjusted groups not only on
the basis of the popular S-I measure, but
equally well by other similar measures making
use of self sorts. It also appears to be
… that only those measures including
the self as a referent were capable of …
in this study. While it may …
that the concept of self is significantly
different … maladjusted persons, in contrast
… conceptions of the ideal self or
the average other person, it remains to be
… concepts relating to significant individual
others, such as parents or spouse, would
be also … by maladjustment.
Friedman’s data (3) indicate that the S-I
correlations of paranoid schizophrenics are
higher than those of neurotics and more like
those of normals. The nine paranoid schizophrenics
in the present study, however, did
not perform significantly differently from the
remainder of the psychotic group, and thus
the present data fail to confirm Friedman’s
findings. An opposite trend is suggested. Further
research may indicate whether such factors
as different item content or differences in
chronicity and degree of illness will contribute
to contradictory results. It may be
significant that Friedman’s items can be said
to be more pathologically oriented than those
in the present study. Friedman ⁴ has suggested
that sampling differences may have
led to the contradictory findings.

To summarize the findings of the present
study, the “maladjusted” Ss, while tending to
perceive the concepts of the ideal self and of
the average other person much as the “adjusted”
Ss did, tended to perceive themselves
as quite different from their ideals and from
their concepts of the average other person. It
is suggested that this dissimilar perception of
self reflects a realistic appraisal …
… selves in contrast to the
beliefs of many who hold that
disturbed patients are incapable of such …
also be 


pothesis tha 
sociated W! 
Butler and H i 

operation: 


esteem, 
other conc 
remaining 
at least re 
adjusted Ss. 


er self-esteem in mal- 
e S-I nor the S-AO 
ccepted at this 
idity where self- 


4 Friedman, 
1956. 
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adjusted Ss may correctly perceive 
themselves as quite different from the generalized “other.” Lastly, maladjusted Ss may sort items so as to yield spuriously high correlations.

Throughout this paper the adjustment measures have been discussed in terms of group performance. While the “adjusted” and “maladjusted” groups tended to have similar concepts of the ideal self and of the average other person, individual Ss occasionally produced sorts for these concepts quite deviant from the norm. It should be emphasized that estimates of adjustment level for single Ss may be distorted to the degree that ideal self- and average-other-person sorts, as well as self sorts, deviate from some adjusted group norm.


Other well- 


Summary 

The present study represented an attempt to measure psychological maladjustment with Q-sort data yielding six adjustment measures utilizing concepts of self, of ideal self, and of the average other person. It was found that only measures containing the self sort could discriminate a group of “adjusted” from three groups of “maladjusted” hospitalized patients. “Maladjusted” Ss saw themselves as being different from their ideals and from their concepts of the average other person, while “adjusted” Ss did not. Both “adjusted” and “maladjusted” Ss tended to hold similar conceptions of the ideal self and of the average other person.
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Change and Receptiveness to Psychotherapy’ 


Malcolm H. Robertson


University of Mississippi 


Research has been largely focused on the 
process and evaluation of change during psy- 
chotherapy. If psychotherapy is a situation 
designed to hasten the process of change (1), 
then an equally pertinent problem is how 


ated to recep- 


ermine whether 
tionship between a 
cept of change and 
erapy. The term re- 


students who had never had 
would accept it. A fourth group included 9
subjects who had actually requested therapy
but who had not had their first interview. A
fifth group was comprised of 23 students who
were undergoing therapy. The groups were


1 An extended report of this study may be obtained without charge from Malcolm H. Robertson, Department of Psychology, University of Mississippi, University, Miss., or for a fee from the American Documentation Institute. Order Document No. 5360, remitting $1.75 for microfilm or $2.50 for photocopies.



comparable in terms of age, sex, education, and socioeconomic status.

The five groups were administered an eight-item questionnaire which measured a global concept of change. The groups were compared in terms of differences in the use of four response categories. Chi-square values were computed to test the significance of differences.

The prediction that the two acceptance groups would have a significantly stronger global concept of change than the nonacceptance group was confirmed. The prediction that the two acceptance groups would not differ from one another and that the beginning therapy and therapy groups would not differ was also confirmed. Contrary to the predictions, the nonacceptance group did not differ significantly from the beginning therapy and therapy groups, and the beginning therapy and therapy groups did have a significantly weaker concept of change than the acceptance groups. Furthermore, those who had had more than 16 therapy sessions had a significantly weaker concept of change than did those who had had less than 16 therapy sessions.

Results were evaluated both in terms of the methodological procedures and the hypothesized relation between concept of change and receptiveness to psychotherapy.

Brief Report.
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A Stereophonic Sound System for Play 
Therapy Observation 


Harold R. Green, John J. Hanson, and Julius Seeman 
George Peabody College for Teachers 


The purpose of this paper is to describe a
sound system built to monitor play therapy
interviews.

Research in play therapy is at best a com- 
plex process; meaning must be derived from 
a subtle interplay of verbal and nonverbal be- 
havior. Often the problem is complicated by 
the sheer difficulty in hearing what is going 
on. The ordinary auditory problems in moni- 
toring are compounded in play therapy by at 
least three factors: (a) Children’s voices are 
sometimes quite low in volume and high in 
pitch. (b) Children often talk and play simultaneously.
The clank of metal or the
splash of water may mask quite effectively
what the child and therapist are saying. (c) 
The need to have a room which can take 
rough usage usually leads to the use of hard 
materials which increase echo. Fiber board 
and low-hanging drapes, both good for sound 
absorbency, are conspicuously absent from 
most play therapy rooms for good reasons. 

When the play therapy room at Peabody 
was built, the Audio-Visual department in- 
stalled the microphone and speakers which 
composed the single system sound pickup. 
The system had the limitations described 
above. Further consultation with the Audio- 
Visual department brought forth the sugges- 
tion to try the use of a binaural or dual sys- 
tem. This system has advantages which are 
quite critical in their value for monitoring 
play therapy from an adjoining room in con- 
junction with the use of a one-way vision 
mirror. 

The conventional one-channel sound system
feeds all sound into a single input and amplifier.
Thus all sounds compete directly for attention.
This factor reduces the possibility of
sound selection and sound localization.

A stereophonic system reduces these diffi- 
culties. The basic plan of this system is to 
provide paired but separate lines from the 
sound to the listener, i.e., paired microphones, 
paired amplifiers, and paired earphones. This 
two-channel system allows the listener to 
bring into greater play one of the major fac- 
tors responsible for sound localization, namely, 
the phase or temporal factor. The two-channel 
input, properly placed, can make approxi- 
mately the same time discriminations as the 
human ears in picking up sound. This factor 
gives the listener a more “natural” situation 
and allows him to exercise the usual selective 
attention to sounds, “tuning out” such sounds 
as the scraping of furniture or the running of 
water when he wishes. It also leads to greater 
fidelity of sound reproduction. 
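
The size of the temporal cue mentioned above can be estimated with a short calculation. The sketch below is only illustrative: it assumes a distant sound source, a speed of sound of about 343 m/s, and it ignores the fiberboard separator; the function name and constants are not from the original report. The 12 and 15 inch spacings are the ones given in the technical requirements for the paired microphones.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air at about 20 °C
INCH_IN_METERS = 0.0254


def arrival_time_difference(spacing_inches, angle_degrees):
    """Far-field estimate (in seconds) of how much later a sound reaches
    the farther of two microphones.

    angle_degrees is measured from straight ahead: 0 means the source is
    directly in front of the pair, 90 means it is directly to one side.
    """
    spacing_m = spacing_inches * INCH_IN_METERS
    path_difference_m = spacing_m * math.sin(math.radians(angle_degrees))
    return path_difference_m / SPEED_OF_SOUND_M_PER_S


for spacing in (12, 15):
    delay_ms = arrival_time_difference(spacing, 90) * 1000
    print(f"{spacing} in. apart, source directly to one side: {delay_ms:.2f} ms")
```

Under these assumptions the largest delay is a little under 1 millisecond for the 12 inch spacing and a little over 1 millisecond for the 15 inch spacing.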

Licklider describes the effect of a two- 
channel system as follows: 

The idea of the simplest two-channel scheme is to 
record the sounds with two microphones, placed in 
the positions of the two ears of a dummy listener. 
When these sounds are led to a remote listener and 
applied to his ears via earphones, he gets almost ex- 
actly the same auditory picture that he would get 
if he were in the dummy’s position. The effect is very 
compelling. When someone with hobnailed boots 
walks past the dummy, the listener pulls in his feet 
to keep them from getting stepped on (1, p. 1030). 


Our system is similar to the one Licklider 
describes. It has the effect, then, of placing 
the listener aurally within the playroom itself.


The technical requirements of the system
are as follows: 


1. Paired microphones, placed horizontally 
12-15 inches apart and separated by a piece 
of fiberboard to reduce sound overlap. 
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2. Separate wires running from the micro- 
phones to paired amplifiers of characteristics 
similar to each other. Each microphone leads
to one amplifier. 

3. Separate wires leading from amplifiers to 
earphones, one wire going to each earpiece. 
So long as the wires are kept separate, it is 
possible to use as many earphones as are 
needed for observation. Our system contains 
15 pairs of earphones, each with its own 
volume control so that the listener can adjust 
the sound to his own hearing comfort. The individual
volume controls are useful but not




left ear the earpiece leading to the right-hand
microphone. This error cancels out localization,
at least temporarily, though it does not
impair audibility.


Summary 


This article describes a two-channel sound 
system used to improve sound reception in 
monitoring play therapy interviews. 
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A Short Forced-Choice Anxiety Scale’ 


Richard Christie and Stanley Budnitzky 


Columbia University 


Bendig (1) has pointed out that but 20 of 
the 50 original items in the Taylor Anxiety 
Scale appear to have clinical validity, and pre- 
sented evidence that the reliability of these 
items is almost as high as of those in the origi- 
nal scale. Heineman (3) constructed forced- 
choice forms of the Taylor scale which largely 
eliminated extraneous variance traceable to re- 
sponse set and social acceptability. 

The advantages of both investigators’ work 
have been combined in a short forced-choice 
scale. Heineman’s item format was utilized 
for the 20 items designated by Bendig. Each 
anxiety item appears in a triplet. We followed
Heineman’s procedure of having each
S indicate which one of the three items was 
most true and which one was least true of 
self. The scoring differed slightly: designa- 
tion of a positively worded anxiety item as 
most true was scored plus one, designation as 
least true as minus one. Negatively stated 
items, i.e., acceptance was indicative of a lack 
of anxiety, were scored in opposite fashion. A 
constant of 20 was added to raw scores mak- 
ing the possible range from zero to 40. 
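
The scoring rule just described can be written out as a short sketch. The data layout (one pair of wording and choice per triplet) and all function names are assumptions made for illustration; the scale itself is available only through the extended report.

```python
CONSTANT = 20  # added to the raw score, as stated in the text


def item_score(wording, choice):
    """Score the anxiety item of one triplet.

    wording: "positive" if calling the item true indicates anxiety,
             "negative" if calling it true indicates a lack of anxiety.
    choice:  "most" or "least" if S marked the anxiety item as most or
             least true of self; None if S marked the other two items.
    """
    if choice is None:
        return 0
    raw = 1 if choice == "most" else -1
    # Negatively stated items are scored in the opposite fashion.
    return raw if wording == "positive" else -raw


def scale_score(responses):
    """Total score over the 20 triplets: raw sum plus the constant of 20."""
    if len(responses) != 20:
        raise ValueError("expected one response per triplet (20 in all)")
    return CONSTANT + sum(item_score(w, c) for w, c in responses)


# Hypothetical record: 8 positive items marked most true, 4 negative items
# marked most true, and 8 items marked neither most nor least true.
example = (
    [("positive", "most")] * 8
    + [("negative", "most")] * 4
    + [("positive", None)] * 8
)
print(scale_score(example))  # 20 + 8 - 4 + 0 = 24
```

With every item scored plus one the total is 40, and with every item scored minus one it is zero, which matches the stated range.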

This scale has been given to four classes of 
medical students.2 The split-half reliabilities


1 An extended report of this study, including the
short scale, may be obtained without charge from 
Richard Christie, 605 West 115th Street, New York 
25, N. Y., or for a fee from the American Docu- 
mentation Institute. Order Document No. 5317, re- 
mitting $1.25 for microfilm or $1.25 for photocopies. 

2 This scale was developed in the course of research 
on the relationship between personality variables and 
performance among medical school students. This is 
a part of ongoing studies in the Sociology of Medi- 
cal Education by the Bureau of Applied Social Re- 
search of Columbia University under a grant from 


the Commonwealth Fund. 


range from .65 to .84, the mean reliability of
.75 comparing with that of .76 reported by
Bendig. Heineman found lower reliabilities for 
one form of scoring and higher for another on 
50 items. 

The validity of the revised scale rests pri- 
marily upon Bendig’s synthesis of the work of 
Buss (2) and of Hoyt and Magoon (4). One 
inferential indication of validity comes from 
one class where pooled ratings of student per- 
formance in clinics were available.’ Twelve of 
the 73 students were rated as outstanding and 
had significantly lower (.01 by Fisher’s exact 
test) scores on the scale than the nine poorest 
performers. Such a finding is consistent with 
the hypothesized relationship between anxiety 
and performance in complex situations. 

Inasmuch as the present modification com- 
bines the desirable features of both Heine- 
man’s and Bendig’s scales, its use is suggested 
when a short forced-choice anxiety scale is 
desired. 


Brief Report. 
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A Note on the Use of Doppelt’s Short Form of the 
WAIS with Psychiatric Patients’ 


David M. Sterne 
VA Hospital, Vancouver, Washington 


In February, 
the development 
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The use of the 


8 brain 
S hospital, 
C Cases Carried 


+From VA Hospital, Vancouver, Washington, 


schizophrenic diagnoses; the remainder were mainly psychoneurotic reactions or personality disorders. The neurological cases included epileptics, traumatic brain damage cases, cerebral arteriosclerotics, multiple sclerotics, and similar conditions with organic brain involvement.

Correlation coefficients between the sum of scaled scores on the four tests and the Full Scale scores were .97 for the psychiatric sample, .88 for the neurological sample, and … for the two combined. Standard errors of estimate were 6.5, 8.0, and 7.1, respectively.

Doppelt’s short form of the WAIS appears to be applicable and useful with male psychiatric patients in a general medical and surgical hospital. Although the number of neurological cases is small, the results are suggestive that prediction with such cases would be somewhat less accurate than with individuals whose
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Performance is unhampered by organic b 
involvement. 
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The Kahn Test of Symbol Arrangement as an 
Aid to Psychodiagnosis 


Paul D. Murphy, M. Richard Ferriman, and Russell W. Bolinger 1
USAF Hospital, Wright-Patterson Air Force Base, Ohio 


Five years have passed since Shoben re- 
viewed the Kahn Test of Symbol Arrange- 
ment and, on the strength of two publications 
(3, 8), wrote: “. . . the KTSA (Kahn Test 
of Symbol Arrangement) is a simpler, more 
widely applicable situation than most instru- 
ments on hand for investigation of develop- 
mental patterns and various attributes of 
psychopathological behavior. On a research 
basis it should be strongly encouraged. As a 
test it is still essentially unproven” (15, p. 
111).

Shoben’s appeal for research with this in- 
strument may have played a part in stimu- 
lating subsequent studies which ranged from 
an investigation of epileptic children (1) to a
study of parolees from a maximum security 
prison (5). The new clinical manual for the 
Kahn test describes fourteen recent studies 
which lend support to the early prediction of 
the instrument’s capacity to investigate psy- 
chopathology (13). Among those which in- 
terested the present authors are two studies 
in which blind psychodiagnosis was attempted 
by means of symbol patterns the test yields 
when the new method of scoring the test is 
employed (11, 12). By means of these pat- 
terns Kahn, Harter, Rider, and Lum claim 
they were able to differentially identify non- 
psychotics, schizophrenics, and patients with 
organic brain disease. In that study, 71.87% 
of an unknown group of 170 subjects were 
correctly classified into the above mentioned 
categories by means of blind sorting accom- 
plished by individuals who had had no train- 


1 Dr. Murphy was the chief of the neuropsychiatric 
service and the mental hygiene clinic at the time this 
study was conducted. The other authors were mem- 
bers of the neuropsychiatric team. 


ing in psychology or psychiatry (13, pp. 112- 
117). Another study reported in the same 
publication describes the successful identifica- 
tion of neurotics, normals, borderline schizo- 
phrenics, psychopaths, and psychotics by 
means of the Kahn test symbol patterns alone 
(13, pp. 117-119 and pp. 153-160). 


Method 


This study presents a modest attempt to 
repeat some of the earlier validation attempts 
in which blind analysis by means of symbol 
pattern was accomplished. 

The test consists of fifteen plastic objects 
which the subject must arrange five times on 
a felt strip having consecutively numbered 
segments ranging from 1 to 15. The adminis- 
tration and test materials are described in an 
earlier publication (8). 

The present study used a sample consisting 
of an unselected group of 48 patients who 
were classified into one of four categories by
forced choice, using the patient’s Kahn sym- 
bol pattern and no other data. The group’s 
mean age was 24 with an SD of 8.7. The edu- 
cational level was 11.3 with an SD of 4.2. All 
of the patients were members of the military 
service; the majority were enlisted men. 


The symbol pattern consists of a number score and 
a series of letters. The number score represents 
weights derived by Kahn from t ratio comparisons of
clinical with normal groups (14). The letters repre- 
sent different levels of abstraction—or “symboliza- 
tion” as Kahn calls it (12). Frequency of occurrence 
determines the serial position of the letter in the 
pattern. The pattern can easily be derived from the 
psychograph which appears on the Individual Record 
Sheet furnished by the test publisher (12). 

The procedure of the study was as follows: Upon 
referral, patients were tested with the Kahn Test of 
Symbol Arrangement before any diagnosis was es- 
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2 Drs. Harry L. MacKinnon 
and Arnold Allen, assistant professors in psychiatry,
School of Medicine, University of Cincinnati.


Table 1 


Sorting of Mental Patients by Kahn Symbol Patterns


Sorting by Symbol Patterns 
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jee N ¢ © $ > 
N ll 3 0 3 17 
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o 0 0 4 0 10 
S 0 2 0 8 
48 
Total 11 20 4 13 
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one study, schizophrenics were not satisfactorily sep- 
arated from either normals or the brain damaged; in 
another all psychotics including organics were thrown 
together in unspecified proportions. Studies seem to 
give conflicting evidence as to whether neurotics are 
differentiated to a clinically useful degree. In no
study was attention given to the troublesome problem
of the base rate. If the base rate for psychosis is
19%, the test probably yields more false positives than
true positives.
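
The base-rate point can be made concrete with a short calculation. The sketch below is only an illustration: the sensitivity and specificity values are hypothetical placeholders, not figures reported for the KTSA, and the base rate is taken as printed in the passage above.

```python
def expected_positives(base_rate, sensitivity, specificity, n=1000):
    """Expected true and false positives among n people tested.

    base_rate:   proportion of the tested group that is actually psychotic
    sensitivity: proportion of psychotic cases the test calls psychotic
    specificity: proportion of other cases the test calls nonpsychotic
    """
    true_pos = n * base_rate * sensitivity
    false_pos = n * (1 - base_rate) * (1 - specificity)
    return true_pos, false_pos


# Hypothetical accuracy figures, chosen only for illustration.
SENSITIVITY = 0.72
SPECIFICITY = 0.72

for base_rate in (0.19, 0.50):
    tp, fp = expected_positives(base_rate, SENSITIVITY, SPECIFICITY)
    print(f"base rate {base_rate:.0%}: "
          f"{tp:.0f} true positives, {fp:.0f} false positives per 1000")
```

Under these assumed figures, false positives outnumber true positives at the lower base rate, while at a base rate of one half the same test yields more true positives than false ones.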
The projective interpretations of the instrument
are stated with becoming modesty, e.g., “slanting
hearts may indicate hostility to the opposite sex.” As 
with many other projective methods, such interpre- 
tations spring from clinical sense and may well lead 
to useful idiographic hypotheses. They are supported 
by no evidence. 

Although the KTSA has been under development
for about ten years, it shows many signs of still being
in a process of evolution. For example, the rules for
interpretation given in the manual are not those used
in any reported research study. The symbol pattern
interpretations stated on a reference card supplied 
with the set even differ a little from those in the 
manual. The test is clearly an interesting device for 


i ferentiating brain damaged cases from normals. In 
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further research, but it is not yet ready for unquali- 
fied use—L. F. S. 


University of Pennsylvania, School of Education, 
Group B of the Suburban School Study Council. 
Pupil Adjustment Inventory. Rating scale for 
teachers’ use in rating pupils. 2 forms. Test pack- 
age; 35 short forms and 5 long forms, with man- 
ual, pp. 17 ($3.60); specimen set (80¢). Boston: 
Houghton Mifflin, 1957. 


The Pupil Adjustment Inventory provides a sys- 
tematic means by which teachers may rate the edu- 
cational and personal adjustment of elementary school 
and high school pupils. The short form, of 15 items, 
is for surveys, the long form, of 55 items, is intended 
for more detailed individual study. Each item is a 
five-step rating scale with each step defined by a 
brief verbal description. The content and form of the 
items seem sound. The Rater’s Manual gives adequate 
instructions for the use of the scales and anecdotes 
supporting their value; it is deficient in that it pro- 
vides no statistical data. The best use of the scales 
will be to increase the involvement of teachers in the 
thoughtful consideration of their pupils as persons.
—L. F. S. 
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